Tag #crawling

60 papers:

FDG-2018-ZhangZHS #game studies
Crawling, indexing, and retrieving moments in videogames (XZ, ZZ, MH, AMS), p. 10.
JCDL-2017-BrunelleWN #javascript
Archival Crawlers and JavaScript: Discover More Stuff but Crawl More Slowly (JFB, MCW, MLN), pp. 1–10.
JCDL-2015-GossenDR #named #social #web
iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling (GG, ED, TR0), pp. 75–84.
SEKE-2015-NakstadWF #gesture #interactive
Finding and Emulating Keyboard, Mouse, and Touch Interactions and Gestures while Crawling RIA’s (FN, HW, YF), pp. 631–638.
HT-2014-AgarwalS #mining
A focused crawler for mining hate and extremism promoting videos on YouTube (SA, AS), pp. 294–296.
HT-2014-GouritenMS #adaptation #scalability
Scalable, generic, and adaptive systems for focused crawling (GG, SM, PS), pp. 35–45.
ICEIS-v2-2014-GomesCLL #linked data #metadata #open data
A Metadata Focused Crawler for Linked Data (RdVAG, MAC, GRL, LAPPL), pp. 489–500.
CIKM-2014-MeuselMB
Focused Crawling for Structured Data (RM, PM, RB), pp. 1039–1048.
ECIR-2014-OstroumovaBCTG #policy #predict #web
Crawling Policies Based on Web Page Popularity Prediction (LO, IB, AC, AT, GG), pp. 100–111.
ECIR-2014-PereiraMCM #web
Time-Aware Focused Web Crawling (PP, JM, OC, HM), pp. 534–539.
FASE-2014-StruberRTC #information retrieval #modelling #using
Splitting Models Using Information Retrieval and Model Crawling Techniques (DS, JR, GT, MC), pp. 47–62.
CIKM-2013-LefortierOSS
Timely crawling of high-quality ephemeral new content (DL, LO, ES, PS), pp. 745–750.
PDP-2013-CamposRMM #distributed
Distributed Ontology-Driven Focused Crawling (RC, OR, MM, MM), pp. 108–115.
VLDB-2012-GoodrichNOPTTL #authentication #performance #verification #web
Efficient Verification of Web-Content Searching Through Authenticated Web Crawlers (MTG, DN, OO, CP, RT, NT, CVL), pp. 920–931.
VLDB-2012-ShengZTJ #algorithm #database #web
Optimal Algorithms for Crawling a Hidden Database in the Web (CS, NZ, YT, XJ), pp. 1112–1123.
CIKM-2012-VuralCS #sentiment #web
Sentiment-focused web crawling (AGV, BBC, PS), pp. 2020–2024.
SAC-2012-FerreiraBMCLF #architecture #framework
An architecture-centered framework for developing blog crawlers (RF, PHdSB, JM, EC, RL, FLGdF), pp. 1131–1136.
SAC-2012-FerreiraLMCFL #framework #named
RetriBlog: a framework for creating blog crawlers (RF, RL, JM, EC, FLGdF, HPLL), pp. 696–701.
ICST-2012-ChoudharyPO #detection #difference #named #web
CrossCheck: Combining Crawling and Differencing to Better Detect Cross-browser Incompatibilities in Web Applications (SRC, MRP, AO), pp. 171–180.
TPDL-2011-SaadPG #navigation #using #web
Coherence-Oriented Crawling and Navigation Using Patterns for Web Archives (MBS, ZP, SG), pp. 421–433.
CIKM-2011-BarbosaB #modelling
Focusing on novelty: a crawling strategy to build diverse language models (LB, SB), pp. 755–764.
CIKM-2011-LiuCZZ #behaviour #web
User browsing behavior-driven web crawling (ML, RC, MZ, LZ), pp. 87–92.
CIKM-2010-FengZXY #rank #using
Focused crawling using navigational rank (SF, LZ, YX, CY), pp. 1513–1516.
CIKM-2010-UrbanoLAM #documentation #web
Crawling the web for structured documents (JU, JL, YA, MM), pp. 1939–1940.
SAC-2010-PirkolaT #approach #problem #using
Addressing the limited scope problem of focused crawling using a result merging approach (AP, TT), pp. 1735–1740.
CIKM-2009-AhlersB #adaptation
Adaptive geospatially focused crawling (DA, SB), pp. 445–454.
CIKM-2009-ZhengDG #graph
Graph-based seed selection for web-scale crawlers (SZ, PD, CLG), pp. 1967–1970.
KDD-2009-YangCWHZM #incremental #web
Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy (JMY, RC, CW, HH, LZ, WYM), pp. 1375–1384.
KDIR-2009-LopesPCO #named #web
Arabella — A Directed Web Crawler (PL, DP, DC, JLO), pp. 270–273.
VLDB-2008-DudaFKZ #named #web
AJAXSearch: crawling, indexing and searching web 2.0 applications (CD, GF, DK, CZ), pp. 1440–1443.
SEKE-2008-LeeIHZ #adaptation #design
Design of an RSS Crawler with Adaptive Revisit Manager (BSL, JWI, BYH, DZ), pp. 219–222.
SIGIR-2008-GuanWCBW #effectiveness #estimation #online #topic #using
Guide focused crawler efficiently and effectively using on-line topical importance estimation (ZG, CW, CC, JB, JW), pp. 757–758.
SIGIR-2008-WangYLCZM #traversal #web
Exploring traversal strategy for web forum crawling (YW, JMY, WL, RC, LZ, WYM), pp. 459–466.
SAC-2008-AssisLSG
The impact of term selection in genre-aware focused crawling (GTdA, AHFL, ASdS, MAG), pp. 1158–1163.
PDP-2008-MarinB #clustering #online
Bulk-Synchronous On-Line Crawling on Clusters of Computers (MM, CB), pp. 414–421.
VLDB-2007-ChoS #rank
RankMass Crawler: A Crawler with High PageRank Coverage Guarantee (JC, US), pp. 375–386.
CIKM-2007-TanMG #clustering #design #policy #web
Designing clustering-based web crawling policies for search engine crawlers (QT, PM, CLG), pp. 535–544.
ICML-2007-BabariaNKSBM #scalability
Focused crawling with scalable ordinal regression solvers (RB, JSN, SK, KRS, CB, MNM), pp. 57–64.
ASE-2007-CaiGH #modelling #performance #web
Synthesizing client load models for performance engineering via web crawling (YC, JCG, JGH), pp. 353–362.
HT-2006-McCownN #evaluation #policy
Evaluation of crawling policies for a web-repository crawler (FM, MLN), pp. 157–168.
SIGIR-2006-VidalSMC #generative
Structure-driven crawler generation by example (MLAV, ASdS, ESdM, JMBC), pp. 292–299.
ECDL-2005-AlmpanidisKP #semantics #using
Focused Crawling Using Latent Semantic Indexing — An Application for Vertical Search Engines (GA, CK, IP), pp. 402–413.
JCDL-2005-ZhuangWG #documentation #library #what
What’s there and what’s not?: focused crawling for missing documents in digital libraries (ZZ, RW, CLG), pp. 301–310.
CIKM-2005-TangHCG #quality #topic
Focused crawling for both topical relevance and quality of medical information (TTT, DH, NC, KG), pp. 147–154.
JCDL-2004-PantTJG #library #named #topic
Panorama: extending digital libraries with topical crawlers (GP, KT, JJ, CLG), pp. 142–150.
JCDL-2004-QinZC #library #web
Building domain-specific web collections for scientific digital libraries: a meta-search enhanced focused crawling method (JQ, YZ, MC), pp. 135–141.
VLDB-2004-EsterKS #performance
Accurate and Efficient Crawling for Relevant Websites (ME, HPK, MS), pp. 396–407.
ECDL-2003-PantM #topic
Topical Crawling for Business Intelligence (GP, FM), pp. 233–244.
VLDB-2003-SizovGT #framework #generative #web
From Focused Crawling to Expert Information: an Application Framework for Web Exploration and Portal Generation (SS, JG, MT), pp. 1105–1108.
ICML-2003-JohnsonTG #evolution #web
Evolving Strategies for Focused Web Crawling (JJ, KT, CLG), pp. 298–305.
SAC-2003-EhrigM #documentation #web
Ontology-Focused Crawling of Web Documents (ME, AM), pp. 1174–1178.
JCDL-2002-LiuMZN #named #web
DP9: an OAI gateway service for web crawlers (XL, KM, MZ, MLN), pp. 283–284.
CIKM-2002-ChungC #collaboration #topic
Topic-oriented collaborative crawling (CC, CLAC), pp. 34–42.
KDD-2002-Aggarwal02a #case study #collaboration #experience #mining #resource management #topic #user interface
Collaborative crawling: mining user experiences for topical resource discovery (CCA), pp. 423–428.
STOC-2002-CooperF #graph #web
Crawling on web graphs (CC, AMF), pp. 419–427.
JCDL-2001-Burke #library #named
Salticus: guided crawling for personal digital libraries (RDB), pp. 88–89.
VLDB-2001-RaghavanG #web
Crawling the Hidden Web (SR, HGM), pp. 129–138.
SIGIR-2001-MenczerPSR #topic #web
Evaluating Topic-Driven Web Crawlers (FM, GP, PS, MER), pp. 241–249.
VLDB-2000-ChoG #evolution #incremental #web
The Evolution of the Web and Implications for an Incremental Crawler (JC, HGM), pp. 200–209.
VLDB-2000-DiligentiCLGG #graph #using
Focused Crawling Using Context Graphs (MD, FC, SL, CLG, MG), pp. 527–534.

The Bibliography of Software Language Engineering in Generated Hypertext (BibSLEIGH) is created and maintained by Dr. Vadim Zaytsev.
It is hosted as part of SLEBOK on GitHub.