Travelled to:
1 × Australia
1 × Brazil
1 × China
1 × Egypt
1 × Finland
1 × France
1 × Greece
1 × Portugal
1 × Singapore
1 × Switzerland
16 × USA
2 × Canada
2 × Spain
2 × The Netherlands
2 × United Kingdom
4 × Italy
Collaborated with:
P.Mitra S.Lawrence H.Han K.Williams H.Zha M.Khabsa P.Treeratpituk Z.Wu K.D.Bollacker H.Chen Y.Liu S.D.Gollapalli S.Ertekin J.Huang I.G.Councill R.Krovetz S.Tuarob Q.Tan K.Tsioutsiouliklis H.Li W.Lee W.Huang D.M.Pennock B.Sun Z.Zhuang G.W.Flake C.Caragea X.Lu J.Z.Wang S.Das L.Bolelli Y.Sun Y.Song K.Bai S.Ugurel Y.Petinot P.B.Teregowda C.W.Omlin D.Yuan S.Zheng E.Manavoglu J.Johnson Z.W.0002 J.Wu D.Zhou D.Lee F.Coetzee S.Debnath E.A.Fox L.Gou J.Wen X.Zhang J.W.0006 S.R.Choudhury C.Liang E.J.Glover F.Å.Nielsen A.Sivasubramaniam S.Kataria M.Gori V.Bhatnagar J.Li P.Dmitriev O.Madani L.Zhang Y.Jo C.Lagoze R.Wagle L.C.Pouchard L.N.Cassel J.Lee M.A.Pérez-Quiñones D.Knox J.Impagliazzo A.Rangaswamy N.Pal A.Kruger H.Yu L.Chen S.Bhatia X.(.Zhang R.Song Y.Qu P.Song L.Bottou K.Bai X.Ji W.Xu S.Park P.M.II E.Horvitz W.P.Birmingham G.Pant S.Yan M.Kan E.Elmacioglu B.Kahle A.Silvescu A.Ororbia S.Wang B.Pursel B.Bräutigam S.Saul H.Williams K.Bowen Z.Li Z.Nie J.Yen V.Petricek I.J.Cox M.Diligenti M.F.Sakr S.P.Levitan D.M.Chiarulli B.G.Horne A.Popescul L.H.Ungar C.Li W.Browuer J.Kim A.Chiatti L.Rokach Q.He B.Chen J.Pei B.Qiu L.V.Lita R.S.Niculescu E.D.Iorio M.Maggini A.Pucci X.Zhang Z.Zhang N.Li L.Zhu K.T.Mueller E.Poweleit A.Kirk S.Szep D.Pellegrino S.Jones Q.Zhao X.Yang D.He Z.Zhou D.Kifer A.M.Ciobanu J.P.F.Ramírez
Talks about:
digit (22) librari (21) search (20) document (19) extract (17) use (15) automat (13) citat (13) topic (11) web (11)

Person: C. Lee Giles

DBLP: Giles:C=_Lee

Facilitated 1 volume:

JCDL 2010 (Ed)

Contributed to:

DocEng 2015
SIGMOD 2015
DocEng 2014
ECIR 2014
JCDL 2014
CIKM 2013
DocEng 2013
ICDAR 2013
VLDB 2013
CIKM 2012
ECIR 2012
SAC 2012
CIKM 2009
ECIR 2009
ICDAR 2009
CHI 2008
CIKM 2008
SIGIR 2008
CIKM 2007
ECIR 2007
ICDAR 2007
KDD 2007
SIGIR 2007
CIKM 2006
ECDL 2006
ECDL 2005
SAC 2005
CIKM 2004
ITiCSE 2004
ICML 2003
ITiCSE 2003
SIGIR 2003
KDD 2002
SIGIR 2002
KDD 2001
CIKM 2000
ICML 2000
KDD 2000
VLDB 2000
CIKM 1999
ICML 1997
ML 1992
DL 1998
DL 1999
ADL 2000
JCDL 2003
JCDL 2004
JCDL 2005
JCDL 2006
JCDL 2007
JCDL 2008
JCDL 2009
JCDL 2010
JCDL 2011
JCDL 2012
JCDL 2013
JCDL 2015
JCDL 2016
JCDL 2017

Wrote 109 papers:

DocEng-2015-LiangWWWPBSWBG #automation #framework #named
BBookX: An Automatic Book Creation Framework (CL, SW, ZW, KW, BP, BB, SS, HW, KB, CLG), pp. 121–124.
DocEng-2015-WangLWWPBSWBG #concept
Concept Hierarchy Extraction from Textbooks (SW, CL, ZW, KW, BP, BB, SS, HW, KB, CLG), pp. 147–156.
SIGMOD-2015-YuanMYG #algorithm #graph
Updating Graph Indices with a One-Pass Algorithm (DY, PM, HY, CLG), pp. 1903–1916.
DocEng-2014-WilliamsCG #ranking
Classifying and ranking search engine results as potential sources of plagiarism (KW, HHC, CLG), pp. 97–106.
DocEng-2014-WilliamsWG #documentation #named
SimSeerX: a similar document search engine (KW, JW, CLG), pp. 143–146.
ECIR-2014-CarageaWCWRCWG #big data #dataset
CiteSeerX: A Scholarly Big Dataset (CC, JW, AMC, KW, JPFR, HHC, ZW, CLG), pp. 311–322.
JCDL-2014-ChenKG #library #metadata #scalability
The feasibility of investing in manual correction of metadata for a large-scale digital library (HHC, MK, CLG), pp. 225–228.
JCDL-2014-HuangWMG #named #recommendation
RefSeer: A citation recommendation system (WH, ZW, PM, CLG), pp. 371–374.
JCDL-2014-WuHCG #metadata #web
Crowd-sourcing Web knowledge for metadata extraction (ZW, WH, LC, CLG), pp. 141–144.
JCDL-2014-WuWKWCHTCOMG #big data #challenge #framework #platform #towards
Towards building a scholarly big data platform: Challenges, lessons and opportunities (ZW, JW, MK, KW, HHC, WH, ST, SRC, AO, PM, CLG), pp. 117–126.
CIKM-2013-Giles #big data #data mining #information management #mining
Scholarly big data: information extraction and data mining (CLG), pp. 1–2.
DocEng-2013-WilliamsG #detection #library
Near duplicate detection in an academic digital library (KW, CLG), pp. 91–94.
DocEng-2013-WuDLMG #documentation #online
Searching online book documents and analyzing book citations (ZW, SD, ZL, PM, CLG), pp. 81–90.
ICDAR-2013-ChoudhuryMKSPJG #documentation #metadata
Figure Metadata Extraction from Digital Documents (SRC, PM, AK, SS, DP, SJ, CLG), pp. 135–139.
ICDAR-2013-TuarobBMG #automation #detection #documentation #machine learning #pseudo #using
Automatic Detection of Pseudocodes in Scholarly Documents Using Machine Learning (ST, SB, PM, CLG), pp. 738–742.
ICDAR-2013-WuMG #documentation #recognition
Table of Contents Recognition and Extraction for Heterogeneous Book Documents (ZW, PM, CLG), pp. 1205–1209.
VLDB-2013-YuanMG #graph #mining
Mining and Indexing Graphs for Supergraph Search (DY, PM, CLG), pp. 829–840.
CIKM-2012-HuangKCMGR #recommendation
Recommending citations: translating papers into references (WH, SK, CC, PM, CLG, LR), pp. 1910–1914.
CIKM-2012-KhabsaTG #using
Entity resolution using search engine results (MK, PT, CLG), pp. 2363–2366.
ECIR-2012-DasMG #classification #identification #topic
Phrase Pair Classification for Identifying Subtopics (SD, PM, CLG), pp. 489–493.
SAC-2012-ChenGZG #metric #network #similarity #using
Discovering missing links in networks using vertex similarity measures (HHC, LG, XZ, CLG), pp. 138–143.
CIKM-2009-HeCPQMG #detection #evolution #how #question #topic
Detecting topic evolution in scientific literature: how can citations help? (QH, BC, JP, BQ, PM, CLG), pp. 957–966.
CIKM-2009-SunMG #graph #independence #information retrieval #mining
Independent informative subgraph mining for graph information retrieval (BS, PM, CLG), pp. 563–572.
CIKM-2009-SunMG09a #graph #learning #online #rank
Learning to rank graphs for online similar graph search (BS, PM, CLG), pp. 1871–1874.
CIKM-2009-ZhengDG #crawling #graph
Graph-based seed selection for web-scale crawlers (SZ, PD, CLG), pp. 1967–1970.
CIKM-2009-ZhengSWG #induction #performance
Efficient record-level wrapper induction (SZ, RS, JRW, CLG), pp. 47–56.
ECIR-2009-BolelliEG #detection #topic #using
Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation (LB, SE, CLG), pp. 776–780.
ECIR-2009-TanMG #documentation #effectiveness #web
Effectively Searching Maps in Web Documents (QT, PM, CLG), pp. 162–176.
ICDAR-2009-LiuBMG #bound #detection #fault #sequence
Improving the Table Boundary Detection in PDFs by Fixing the Sequence Error of the Sparse Lines (YL, KB, PM, CLG), pp. 1006–1010.
CHI-2008-ZhangQGS #named #research
CiteSense: supporting sensemaking of research literature (XZ, YQ, CLG, PS), pp. 677–680.
CIKM-2008-HuangMG #categorisation #framework #multi
Error-driven generalist+experts (edge): a multi-stage ensemble framework for text categorization (JH, OM, CLG), pp. 83–92.
CIKM-2008-LiNLGW #community #scalability
Scalable community discovery on textual data with relations (HL, ZN, WCL, CLG, JRW), pp. 1203–1212.
CIKM-2008-LiuLNBMG #dataset #feature model #performance #preprocessor #realtime #scalability
Real-time data pre-processing technique for efficient feature extraction in large scale datasets (YL, LVL, RSN, KB, PM, CLG), pp. 981–990.
CIKM-2008-LiuMG #bound #detection #documentation #identification
Identifying table boundaries in digital documents via sparse line detection (YL, PM, CLG), pp. 1311–1320.
CIKM-2008-SongZG #classification #framework #performance #process
A sparse gaussian processes classification framework for fast tag suggestions (YS, LZ, CLG), pp. 93–102.
CIKM-2008-SunLCLG #library
Measuring user preference changes in digital libraries (YS, HL, IGC, WCL, CLG), pp. 1497–1498.
CIKM-2008-TanMG #documentation #metadata #web
Metadata extraction and indexing for map search in web documents (QT, PM, CLG), pp. 1367–1368.
SIGIR-2008-SongZLZLLG #automation #realtime #recommendation
Real-time automatic tag recommendation (YS, ZZ, HL, QZ, JL, WCL, CLG), pp. 515–522.
CIKM-2007-ErtekinHBG #classification #learning
Learning on the border: active learning in imbalanced data classification (SE, JH, LB, CLG), pp. 127–136.
CIKM-2007-TanMG #clustering #crawling #design #policy #web
Designing clustering-based web crawling policies for search engine crawlers (QT, PM, CLG), pp. 535–544.
ECIR-2007-SunG #library #ranking
Popularity Weighted Ranking for Academic Digital Libraries (YS, CLG), pp. 605–612.
ICDAR-2007-LiuBMG #documentation
Searching for Tables in Digital Documents (YL, KB, PM, CLG), pp. 934–938.
ICDAR-2007-LuWMG #2d #automation #documentation
Automatic Extraction of Data from 2-D Plots in Documents (XL, JZW, PM, CLG), pp. 188–192.
KDD-2007-JoLG #correlation #detection #graph #research #topic
Detecting research topics via the correlation between graphs and texts (YJ, CL, CLG), pp. 370–379.
SIGIR-2007-ErtekinHG #learning #problem
Active learning for class imbalance problem (SE, JH, CLG), pp. 823–824.
SIGIR-2007-SunMGYZ #detection #documentation #multi #segmentation #topic
Topic segmentation with shared topic detection and alignment of multiple documents (BS, PM, CLG, JY, HZ), pp. 199–206.
CIKM-2006-ZhouJZG #evolution #how #interactive #research #social #topic
Topic evolution and social interactions: how authors effect research (DZ, XJ, HZ, CLG), pp. 248–257.
ECDL-2006-CouncillGIGMP #architecture #deployment #flexibility #generative #library #towards
Towards Next Generation CiteSeer: A Flexible Architecture for Digital Library Deployment (IGC, CLG, EDI, MG, MM, AP), pp. 111–122.
ECDL-2005-PetricekCHCG #comparison #database #online
A Comparison of On-Line Computer Science Citation Databases (VP, IJC, HH, IGC, CLG), pp. 438–449.
SAC-2005-DebnathMG #automation
Automatic extraction of informative blocks from webpages (SD, PM, CLG), pp. 1722–1726.
SAC-2005-HanMZTGZ #clustering #documentation #metadata #rule-based #word
Rule-based word clustering for document metadata extraction (HH, EM, HZ, KT, CLG, XZ), pp. 1049–1053.
SAC-2005-HanXZG #ambiguity #naive bayes
A hierarchical naive Bayes mixture model for name disambiguation in author citations (HH, WX, HZ, CLG), pp. 1065–1069.
CIKM-2004-PetinotGBTHC #library #named #towards
CiteSeer-API: towards seamless resource location and interlinking for digital libraries (YP, CLG, VB, PBT, HH, IGC), pp. 553–561.
ITiCSE-2004-CasselFLPKIG #using
Using CITIDEL to develop and share class plans (LNC, EAF, JL, MAPQ, DK, JI, CLG), p. 270.
ICML-2003-JohnsonTG #crawling #evolution #web
Evolving Strategies for Focused Web Crawling (JJ, KT, CLG), pp. 298–305.
ITiCSE-2003-CasselIKGFLP #education #library #using
Using an education oriented digital library to organize and present classes in computing and information (LNC, JI, DK, CLG, EAF, JL, MAPQ), p. 260.
SIGIR-2003-GilesPTHLRP #named
eBizSearch: a niche search engine for e-business (CLG, YP, PBT, HH, SL, AR, NP), pp. 413–414.
SIGIR-2003-HanMGZ #classification #clustering #rule-based #word
Rule-based word clustering for text classification (HH, EM, CLG, HZ), pp. 445–446.
SIGIR-2003-KrovetzUG #classification #source code
Classification of source code archives (RK, SU, CLG), pp. 425–426.
KDD-2002-UgurelKG #automation #classification #source code #what
What’s the code?: automatic classification of source code archives (SU, RK, CLG), pp. 639–644.
SIGIR-2002-ParkPGK #analysis #documentation
Analysis of lexical signatures for finding lost or related documents (STP, DMP, CLG, RK), pp. 11–18.
KDD-2001-PennockLNG #game studies #probability #web
Extracting collective probabilistic forecasts from web games (DMP, SL, FÅN, CLG), pp. 174–183.
CIKM-2000-KrugerGCGFLO #named
DEADLINER: Building a New Niche Search Engine (AK, CLG, FC, EJG, GWF, SL, CWO), pp. 272–281.
CIKM-2000-LawrenceCFPKNKG #persistent #research #web
Persistence of information on the web: Analyzing citations contained in research articles (SL, FC, GWF, DMP, RK, FÅN, AK, CLG), pp. 235–242.
ICML-2000-PennockMGH #algorithm #learning
A Normative Examination of Ensemble Learning Algorithms (DMP, PMRI, CLG, EH), pp. 735–742.
KDD-2000-FlakeLG #community #identification #performance #web
Efficient identification of Web communities (GWF, SL, CLG), pp. 150–160.
VLDB-2000-DiligentiCLGG #crawling #graph #using
Focused Crawling Using Context Graphs (MD, FC, SL, CLG, MG), pp. 527–534.
CIKM-1999-Giles #question #web #what
Searching the Web: Can You Find What You Want? (CLG), pp. 1–2.
CIKM-1999-GloverLBG #architecture
Architecture of a Metasearch Engine That Supports User Information Needs (EJG, SL, WPB, CLG), pp. 210–216.
CIKM-1999-LawrenceBG #retrieval
Indexing and Retrieval of Scientific Literature (SL, KDB, CLG), pp. 139–146.
ICML-1997-SakrLCHG #data access #learning #memory management #modelling #multi #predict
Predicting Multiprocessor Memory Access Patterns with Learning Models (MFS, SPL, DMC, BGH, CLG), pp. 305–312.
ML-1992-OmlinG #higher-order #network #using
Training Second-Order Recurrent Neural Networks using Hints (CWO, CLG), pp. 361–366.
DL-1998-GilesBL #automation #named
CiteSeer: An Automatic Citation Indexing System (CLG, KDB, SL), pp. 89–98.
DL-1999-BollackerLG #automation #personalisation #web
A System for Automatic Personalized Tracking of Scientific Literature on the Web (KDB, SL, CLG), pp. 105–113.
DL-1999-LawrenceBG #distributed #fault
Distributed Error Correction (SL, KDB, CLG), p. 232.
ADL-2000-PopesculFLUG #clustering #database #documentation #identification #roadmap
Clustering and Identifying Temporal Trends in Document Databases (AP, GWF, SL, LHU, CLG), pp. 173–182.
JCDL-2003-HanGMZZF #automation #documentation #metadata #using
Automatic Document Metadata Extraction Using Support Vector Machines (HH, CLG, EM, HZ, ZZ, EAF), pp. 37–48.
JCDL-2003-PetinotTHGLRP #library #named
eBizSearch: An OAI-Compliant Digital Library for eBusiness (YP, PBT, HH, CLG, SL, AR, NP), pp. 199–209.
JCDL-2004-HanGZLT #ambiguity #learning
Two supervised learning approaches for name disambiguation in author citations (HH, CLG, HZ, CL, KT), pp. 296–305.
JCDL-2004-PantTJG #crawling #library #named #topic
Panorama: extending digital libraries with topical crawlers (GP, KT, JJ, CLG), pp. 142–150.
JCDL-2004-PetinotGBTH #api #library
Enabling interoperability for autonomous digital libraries: an API to citeseer services (YP, CLG, VB, PBT, HH), pp. 372–373.
JCDL-2005-HanZG #ambiguity #clustering #using
Name disambiguation in author citations using a K-way spectral clustering method (HH, HZ, CLG), pp. 334–343.
JCDL-2005-ZhuangWG #crawling #documentation #library #what
What’s there and what’s not?: focused crawling for missing documents in digital libraries (ZZ, RW, CLG), pp. 301–310.
JCDL-2006-CouncillLZDBLSG #learning #metadata #online
Learning metadata from the evidence in an on-line citation matching scheme (IGC, HL, ZZ, SD, LB, WCL, AS, CLG), pp. 276–285.
JCDL-2006-LuMWG #automation #categorisation #documentation
Automatic categorization of figures in scientific documents (XL, PM, JZW, CLG), pp. 129–138.
JCDL-2007-LiLSG #generative #library #named
SearchGen: a synthetic workload generator for scientific literature digital libraries and search engines (HL, WCL, AS, CLG), pp. 137–146.
JCDL-2007-LiuBMG #automation #library #metadata #named
TableSeer: automatic table metadata extraction and searching in digital libraries (YL, KB, PM, CLG), pp. 91–100.
JCDL-2007-SongHCLG #ambiguity #performance #topic
Efficient topic-based unsupervised name disambiguation (YS, JH, IGC, JL, CLG), pp. 342–351.
JCDL-2007-YanLKG #adaptation #performance
Adaptive sorted neighborhood methods for efficient record linkage (SY, DL, MYK, CLG), pp. 185–194.
JCDL-2007-ZhuangELG #mining #quality
Measuring conference quality by mining program committee characteristics (ZZ, EE, DL, CLG), pp. 225–234.
JCDL-2008-BrowuerKDMG #2d
Segregating and extracting overlapping data points in two-dimensional plots (WB, SK, SD, PM, CLG), pp. 276–279.
JCDL-2008-LuKWG #generative #metadata
A metadata generation system for scanned scientific volumes (XL, BK, JZW, CLG), pp. 167–176.
JCDL-2009-BolelliEZG #library #roadmap #topic
Finding topic trends in digital libraries (LB, SE, DZ, CLG), pp. 69–72.
JCDL-2009-TreeratpitukG #random #using
Disambiguating authors in academic publications using random forests (PT, CLG), pp. 39–48.
JCDL-2010-GouZCKG #documentation #network #ranking #social
Social network document ranking (LG, XZ, HHC, JHK, CLG), pp. 313–322.
JCDL-2010-LiZMMPG #library #semantics
oreChem ChemXSeer: a semantic digital library for chemistry (NL, LZ, PM, KTM, EP, CLG), pp. 245–254.
JCDL-2011-GollapalliGMC #identification #library #on the
On identifying academic homepages for digital libraries (SDG, CLG, PM, CC), pp. 123–132.
JCDL-2011-GollapalliMG #library #ranking
Ranking authors in digital libraries (SDG, PM, CLG), pp. 251–254.
JCDL-2012-GollapalliMG #research
Similar researcher search in academic environments (SDG, PM, CLG), pp. 167–170.
JCDL-2012-KhabsaTG #automation #library #named #repository
AckSeer: a repository and search engine for automatically extracted acknowledgments from digital libraries (MK, PT, CLG), pp. 185–194.
JCDL-2012-TuarobMG #algorithm #network #using
Improving algorithm search using the algorithm co-citation network (ST, PM, CLG), pp. 277–280.
JCDL-2013-CarageaSMG #recommendation
Can’t see the forest for the trees?: a citation recommendation system (CC, AS, PM, CLG), pp. 111–114.
JCDL-2013-GollapalliMG #graph #ranking #topic #using
Ranking experts using author-document-topic graphs (SDG, PM, CLG), pp. 87–96.
JCDL-2013-TuarobPG #automation #metadata #modelling #probability #recommendation #topic #using
Automatic tag recommendation for metadata annotation using probabilistic topic modeling (ST, LCP, CLG), pp. 239–248.
JCDL-2015-KhabsaTG #ambiguity #constraints #online
Online Person Name Disambiguation with Constraints (MK, PT, CLG), pp. 37–46.
JCDL-2016-KhabsaWG #comprehension #towards
Towards Better Understanding of Academic Search (MK, ZW, CLG), pp. 111–114.
JCDL-2016-WilliamsWWG #information management #library
Information Extraction for Scholarly Digital Libraries (KW, JW, ZW, CLG), pp. 287–288.
JCDL-2017-WuCCLG #approach #hybrid #named
HESDK: A Hybrid Approach to Extracting Scientific Domain Knowledge Entities (JW, SRC, AC, CL, CLG), pp. 241–244.
JCDL-2017-YangHHOZKG #identification #learning #library #using
Smart Library: Identifying Books on Library Shelves Using Supervised Deep Learning for Scene Text Reading (XY, DH, WH, AO, ZZ, DK, CLG), pp. 245–248.

Bibliography of Software Language Engineering in Generated Hypertext (BibSLEIGH) is created and maintained by Dr. Vadim Zaytsev.
Hosted as a part of SLEBOK on GitHub.