Tag #crawling
60 papers:
- FDG-2018-ZhangZHS #game studies
- Crawling, indexing, and retrieving moments in videogames (XZ, ZZ, MH, AMS), p. 10.
- JCDL-2017-BrunelleWN #javascript
- Archival Crawlers and JavaScript: Discover More Stuff but Crawl More Slowly (JFB, MCW, MLN), pp. 1–10.
- JCDL-2015-GossenDR #named #social #web
- iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling (GG, ED, TR0), pp. 75–84.
- SEKE-2015-NakstadWF #gesture #interactive
- Finding and Emulating Keyboard, Mouse, and Touch Interactions and Gestures while Crawling RIA’s (FN, HW, YF), pp. 631–638.
- HT-2014-AgarwalS #mining
- A focused crawler for mining hate and extremism promoting videos on YouTube (SA, AS), pp. 294–296.
- HT-2014-GouritenMS #adaptation #scalability
- Scalable, generic, and adaptive systems for focused crawling (GG, SM, PS), pp. 35–45.
- ICEIS-v2-2014-GomesCLL #linked data #metadata #open data
- A Metadata Focused Crawler for Linked Data (RdVAG, MAC, GRL, LAPPL), pp. 489–500.
- CIKM-2014-MeuselMB
- Focused Crawling for Structured Data (RM, PM, RB), pp. 1039–1048.
- ECIR-2014-OstroumovaBCTG #policy #predict #web
- Crawling Policies Based on Web Page Popularity Prediction (LO, IB, AC, AT, GG), pp. 100–111.
- ECIR-2014-PereiraMCM #web
- Time-Aware Focused Web Crawling (PP, JM, OC, HM), pp. 534–539.
- FASE-2014-StruberRTC #information retrieval #modelling #using
- Splitting Models Using Information Retrieval and Model Crawling Techniques (DS, JR, GT, MC), pp. 47–62.
- CIKM-2013-LefortierOSS
- Timely crawling of high-quality ephemeral new content (DL, LO, ES, PS), pp. 745–750.
- PDP-2013-CamposRMM #distributed
- Distributed Ontology-Driven Focused Crawling (RC, OR, MM, MM), pp. 108–115.
- VLDB-2012-GoodrichNOPTTL #authentication #performance #verification #web
- Efficient Verification of Web-Content Searching Through Authenticated Web Crawlers (MTG, DN, OO, CP, RT, NT, CVL), pp. 920–931.
- VLDB-2012-ShengZTJ #algorithm #database #web
- Optimal Algorithms for Crawling a Hidden Database in the Web (CS, NZ, YT, XJ), pp. 1112–1123.
- CIKM-2012-VuralCS #sentiment #web
- Sentiment-focused web crawling (AGV, BBC, PS), pp. 2020–2024.
- SAC-2012-FerreiraBMCLF #architecture #framework
- An architecture-centered framework for developing blog crawlers (RF, PHdSB, JM, EC, RL, FLGdF), pp. 1131–1136.
- SAC-2012-FerreiraLMCFL #framework #named
- RetriBlog: a framework for creating blog crawlers (RF, RL, JM, EC, FLGdF, HPLL), pp. 696–701.
- ICST-2012-ChoudharyPO #detection #difference #named #web
- CrossCheck: Combining Crawling and Differencing to Better Detect Cross-browser Incompatibilities in Web Applications (SRC, MRP, AO), pp. 171–180.
- TPDL-2011-SaadPG #navigation #using #web
- Coherence-Oriented Crawling and Navigation Using Patterns for Web Archives (MBS, ZP, SG), pp. 421–433.
- CIKM-2011-BarbosaB #modelling
- Focusing on novelty: a crawling strategy to build diverse language models (LB, SB), pp. 755–764.
- CIKM-2011-LiuCZZ #behaviour #web
- User browsing behavior-driven web crawling (ML, RC, MZ, LZ), pp. 87–92.
- CIKM-2010-FengZXY #rank #using
- Focused crawling using navigational rank (SF, LZ, YX, CY), pp. 1513–1516.
- CIKM-2010-UrbanoLAM #documentation #web
- Crawling the web for structured documents (JU, JL, YA, MM), pp. 1939–1940.
- SAC-2010-PirkolaT #approach #problem #using
- Addressing the limited scope problem of focused crawling using a result merging approach (AP, TT), pp. 1735–1740.
- CIKM-2009-AhlersB #adaptation
- Adaptive geospatially focused crawling (DA, SB), pp. 445–454.
- CIKM-2009-ZhengDG #graph
- Graph-based seed selection for web-scale crawlers (SZ, PD, CLG), pp. 1967–1970.
- KDD-2009-YangCWHZM #incremental #web
- Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy (JMY, RC, CW, HH, LZ, WYM), pp. 1375–1384.
- KDIR-2009-LopesPCO #named #web
- Arabella — A Directed Web Crawler (PL, DP, DC, JLO), pp. 270–273.
- VLDB-2008-DudaFKZ #named #web
- AJAXSearch: crawling, indexing and searching web 2.0 applications (CD, GF, DK, CZ), pp. 1440–1443.
- SEKE-2008-LeeIHZ #adaptation #design
- Design of an RSS Crawler with Adaptive Revisit Manager (BSL, JWI, BYH, DZ), pp. 219–222.
- SIGIR-2008-GuanWCBW #effectiveness #estimation #online #topic #using
- Guide focused crawler efficiently and effectively using on-line topical importance estimation (ZG, CW, CC, JB, JW), pp. 757–758.
- SIGIR-2008-WangYLCZM #traversal #web
- Exploring traversal strategy for web forum crawling (YW, JMY, WL, RC, LZ, WYM), pp. 459–466.
- SAC-2008-AssisLSG
- The impact of term selection in genre-aware focused crawling (GTdA, AHFL, ASdS, MAG), pp. 1158–1163.
- PDP-2008-MarinB #clustering #online
- Bulk-Synchronous On-Line Crawling on Clusters of Computers (MM, CB), pp. 414–421.
- VLDB-2007-ChoS #rank
- RankMass Crawler: A Crawler with High PageRank Coverage Guarantee (JC, US), pp. 375–386.
- CIKM-2007-TanMG #clustering #design #policy #web
- Designing clustering-based web crawling policies for search engine crawlers (QT, PM, CLG), pp. 535–544.
- ICML-2007-BabariaNKSBM #scalability
- Focused crawling with scalable ordinal regression solvers (RB, JSN, SK, KRS, CB, MNM), pp. 57–64.
- ASE-2007-CaiGH #modelling #performance #web
- Synthesizing client load models for performance engineering via web crawling (YC, JCG, JGH), pp. 353–362.
- HT-2006-McCownN #evaluation #policy
- Evaluation of crawling policies for a web-repository crawler (FM, MLN), pp. 157–168.
- SIGIR-2006-VidalSMC #generative
- Structure-driven crawler generation by example (MLAV, ASdS, ESdM, JMBC), pp. 292–299.
- ECDL-2005-AlmpanidisKP #semantics #using
- Focused Crawling Using Latent Semantic Indexing — An Application for Vertical Search Engines (GA, CK, IP), pp. 402–413.
- JCDL-2005-ZhuangWG #documentation #library #what
- What’s there and what’s not?: focused crawling for missing documents in digital libraries (ZZ, RW, CLG), pp. 301–310.
- CIKM-2005-TangHCG #quality #topic
- Focused crawling for both topical relevance and quality of medical information (TTT, DH, NC, KG), pp. 147–154.
- JCDL-2004-PantTJG #library #named #topic
- Panorama: extending digital libraries with topical crawlers (GP, KT, JJ, CLG), pp. 142–150.
- JCDL-2004-QinZC #library #web
- Building domain-specific web collections for scientific digital libraries: a meta-search enhanced focused crawling method (JQ, YZ, MC), pp. 135–141.
- VLDB-2004-EsterKS #performance
- Accurate and Efficient Crawling for Relevant Websites (ME, HPK, MS), pp. 396–407.
- ECDL-2003-PantM #topic
- Topical Crawling for Business Intelligence (GP, FM), pp. 233–244.
- VLDB-2003-SizovGT #framework #generative #web
- From Focused Crawling to Expert Information: an Application Framework for Web Exploration and Portal Generation (SS, JG, MT), pp. 1105–1108.
- ICML-2003-JohnsonTG #evolution #web
- Evolving Strategies for Focused Web Crawling (JJ, KT, CLG), pp. 298–305.
- SAC-2003-EhrigM #documentation #web
- Ontology-Focused Crawling of Web Documents (ME, AM), pp. 1174–1178.
- JCDL-2002-LiuMZN #named #web
- DP9: an OAI gateway service for web crawlers (XL, KM, MZ, MLN), pp. 283–284.
- CIKM-2002-ChungC #collaboration #topic
- Topic-oriented collaborative crawling (CC, CLAC), pp. 34–42.
- KDD-2002-Aggarwal02a #case study #collaboration #experience #mining #resource management #topic #user interface
- Collaborative crawling: mining user experiences for topical resource discovery (CCA), pp. 423–428.
- STOC-2002-CooperF #graph #web
- Crawling on web graphs (CC, AMF), pp. 419–427.
- JCDL-2001-Burke #library #named
- Salticus: guided crawling for personal digital libraries (RDB), pp. 88–89.
- VLDB-2001-RaghavanG #web
- Crawling the Hidden Web (SR, HGM), pp. 129–138.
- SIGIR-2001-MenczerPSR #topic #web
- Evaluating Topic-Driven Web Crawlers (FM, GP, PS, MER), pp. 241–249.
- VLDB-2000-ChoG #evolution #incremental #web
- The Evolution of the Web and Implications for an Incremental Crawler (JC, HGM), pp. 200–209.
- VLDB-2000-DiligentiCLGG #graph #using
- Focused Crawling Using Context Graphs (MD, FC, SL, CLG, MG), pp. 527–534.