Tag #crawling
60 papers:
FDG-2018-ZhangZHS #game studies- Crawling, indexing, and retrieving moments in videogames (XZ, ZZ, MH, AMS), p. 10.
JCDL-2017-BrunelleWN #javascript- Archival Crawlers and JavaScript: Discover More Stuff but Crawl More Slowly (JFB, MCW, MLN), pp. 1–10.
JCDL-2015-GossenDR #named #social #web- iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling (GG, ED, TR), pp. 75–84.
SEKE-2015-NakstadWF #gesture #interactive- Finding and Emulating Keyboard, Mouse, and Touch Interactions and Gestures while Crawling RIA’s (FN, HW, YF), pp. 631–638.
HT-2014-AgarwalS #mining- A focused crawler for mining hate and extremism promoting videos on YouTube (SA, AS), pp. 294–296.
HT-2014-GouritenMS #adaptation #scalability- Scalable, generic, and adaptive systems for focused crawling (GG, SM, PS), pp. 35–45.
ICEIS-v2-2014-GomesCLL #linked data #metadata #open data- A Metadata Focused Crawler for Linked Data (RdVAG, MAC, GRL, LAPPL), pp. 489–500.
CIKM-2014-MeuselMB - Focused Crawling for Structured Data (RM, PM, RB), pp. 1039–1048.
ECIR-2014-OstroumovaBCTG #policy #predict #web- Crawling Policies Based on Web Page Popularity Prediction (LO, IB, AC, AT, GG), pp. 100–111.
ECIR-2014-PereiraMCM #web- Time-Aware Focused Web Crawling (PP, JM, OC, HM), pp. 534–539.
FASE-2014-StruberRTC #information retrieval #modelling #using- Splitting Models Using Information Retrieval and Model Crawling Techniques (DS, JR, GT, MC), pp. 47–62.
CIKM-2013-LefortierOSS - Timely crawling of high-quality ephemeral new content (DL, LO, ES, PS), pp. 745–750.
PDP-2013-CamposRMM #distributed- Distributed Ontology-Driven Focused Crawling (RC, OR, MM, MM), pp. 108–115.
VLDB-2012-GoodrichNOPTTL #authentication #performance #verification #web- Efficient Verification of Web-Content Searching Through Authenticated Web Crawlers (MTG, DN, OO, CP, RT, NT, CVL), pp. 920–931.
VLDB-2012-ShengZTJ #algorithm #database #web- Optimal Algorithms for Crawling a Hidden Database in the Web (CS, NZ, YT, XJ), pp. 1112–1123.
CIKM-2012-VuralCS #sentiment #web- Sentiment-focused web crawling (AGV, BBC, PS), pp. 2020–2024.
SAC-2012-FerreiraBMCLF #architecture #framework- An architecture-centered framework for developing blog crawlers (RF, PHdSB, JM, EC, RL, FLGdF), pp. 1131–1136.
SAC-2012-FerreiraLMCFL #framework #named- RetriBlog: a framework for creating blog crawlers (RF, RL, JM, EC, FLGdF, HPLL), pp. 696–701.
ICST-2012-ChoudharyPO #detection #difference #named #web- CrossCheck: Combining Crawling and Differencing to Better Detect Cross-browser Incompatibilities in Web Applications (SRC, MRP, AO), pp. 171–180.
TPDL-2011-SaadPG #navigation #using #web- Coherence-Oriented Crawling and Navigation Using Patterns for Web Archives (MBS, ZP, SG), pp. 421–433.
CIKM-2011-BarbosaB #modelling- Focusing on novelty: a crawling strategy to build diverse language models (LB, SB), pp. 755–764.
CIKM-2011-LiuCZZ #behaviour #web- User browsing behavior-driven web crawling (ML, RC, MZ, LZ), pp. 87–92.
CIKM-2010-FengZXY #rank #using- Focused crawling using navigational rank (SF, LZ, YX, CY), pp. 1513–1516.
CIKM-2010-UrbanoLAM #documentation #web- Crawling the web for structured documents (JU, JL, YA, MM), pp. 1939–1940.
SAC-2010-PirkolaT #approach #problem #using- Addressing the limited scope problem of focused crawling using a result merging approach (AP, TT), pp. 1735–1740.
CIKM-2009-AhlersB #adaptation- Adaptive geospatially focused crawling (DA, SB), pp. 445–454.
CIKM-2009-ZhengDG #graph- Graph-based seed selection for web-scale crawlers (SZ, PD, CLG), pp. 1967–1970.
KDD-2009-YangCWHZM #incremental #web- Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy (JMY, RC, CW, HH, LZ, WYM), pp. 1375–1384.
KDIR-2009-LopesPCO #named #web- Arabella — A Directed Web Crawler (PL, DP, DC, JLO), pp. 270–273.
VLDB-2008-DudaFKZ #named #web- AJAXSearch: crawling, indexing and searching web 2.0 applications (CD, GF, DK, CZ), pp. 1440–1443.
SEKE-2008-LeeIHZ #adaptation #design- Design of an RSS Crawler with Adaptive Revisit Manager (BSL, JWI, BYH, DZ), pp. 219–222.
SIGIR-2008-GuanWCBW #effectiveness #estimation #online #topic #using- Guide focused crawler efficiently and effectively using on-line topical importance estimation (ZG, CW, CC, JB, JW), pp. 757–758.
SIGIR-2008-WangYLCZM #traversal #web- Exploring traversal strategy for web forum crawling (YW, JMY, WL, RC, LZ, WYM), pp. 459–466.
SAC-2008-AssisLSG - The impact of term selection in genre-aware focused crawling (GTdA, AHFL, ASdS, MAG), pp. 1158–1163.
PDP-2008-MarinB #clustering #online- Bulk-Synchronous On-Line Crawling on Clusters of Computers (MM, CB), pp. 414–421.
VLDB-2007-ChoS #rank- RankMass Crawler: A Crawler with High PageRank Coverage Guarantee (JC, US), pp. 375–386.
CIKM-2007-TanMG #clustering #design #policy #web- Designing clustering-based web crawling policies for search engine crawlers (QT, PM, CLG), pp. 535–544.
ICML-2007-BabariaNKSBM #scalability- Focused crawling with scalable ordinal regression solvers (RB, JSN, SK, KRS, CB, MNM), pp. 57–64.
ASE-2007-CaiGH #modelling #performance #web- Synthesizing client load models for performance engineering via web crawling (YC, JCG, JGH), pp. 353–362.
HT-2006-McCownN #evaluation #policy- Evaluation of crawling policies for a web-repository crawler (FM, MLN), pp. 157–168.
SIGIR-2006-VidalSMC #generative- Structure-driven crawler generation by example (MLAV, ASdS, ESdM, JMBC), pp. 292–299.
ECDL-2005-AlmpanidisKP #semantics #using- Focused Crawling Using Latent Semantic Indexing — An Application for Vertical Search Engines (GA, CK, IP), pp. 402–413.
JCDL-2005-ZhuangWG #documentation #library #what- What’s there and what’s not?: focused crawling for missing documents in digital libraries (ZZ, RW, CLG), pp. 301–310.
CIKM-2005-TangHCG #quality #topic- Focused crawling for both topical relevance and quality of medical information (TTT, DH, NC, KG), pp. 147–154.
JCDL-2004-PantTJG #library #named #topic- Panorama: extending digital libraries with topical crawlers (GP, KT, JJ, CLG), pp. 142–150.
JCDL-2004-QinZC #library #web- Building domain-specific web collections for scientific digital libraries: a meta-search enhanced focused crawling method (JQ, YZ, MC), pp. 135–141.
VLDB-2004-EsterKS #performance- Accurate and Efficient Crawling for Relevant Websites (ME, HPK, MS), pp. 396–407.
ECDL-2003-PantM #topic- Topical Crawling for Business Intelligence (GP, FM), pp. 233–244.
VLDB-2003-SizovGT #framework #generative #web- From Focused Crawling to Expert Information: an Application Framework for Web Exploration and Portal Generation (SS, JG, MT), pp. 1105–1108.
ICML-2003-JohnsonTG #evolution #web- Evolving Strategies for Focused Web Crawling (JJ, KT, CLG), pp. 298–305.
SAC-2003-EhrigM #documentation #web- Ontology-Focused Crawling of Web Documents (ME, AM), pp. 1174–1178.
JCDL-2002-LiuMZN #named #web- DP9: an OAI gateway service for web crawlers (XL, KM, MZ, MLN), pp. 283–284.
CIKM-2002-ChungC #collaboration #topic- Topic-oriented collaborative crawling (CC, CLAC), pp. 34–42.
KDD-2002-Aggarwal02a #case study #collaboration #experience #mining #resource management #topic #user interface- Collaborative crawling: mining user experiences for topical resource discovery (CCA), pp. 423–428.
STOC-2002-CooperF #graph #web- Crawling on web graphs (CC, AMF), pp. 419–427.
JCDL-2001-Burke #library #named- Salticus: guided crawling for personal digital libraries (RDB), pp. 88–89.
VLDB-2001-RaghavanG #web- Crawling the Hidden Web (SR, HGM), pp. 129–138.
SIGIR-2001-MenczerPSR #topic #web- Evaluating Topic-Driven Web Crawlers (FM, GP, PS, MER), pp. 241–249.
VLDB-2000-ChoG #evolution #incremental #web- The Evolution of the Web and Implications for an Incremental Crawler (JC, HGM), pp. 200–209.
VLDB-2000-DiligentiCLGG #graph #using- Focused Crawling Using Context Graphs (MD, FC, SL, CLG, MG), pp. 527–534.