38 papers:
SIGMOD-2015-LiuSWRH #corpus #mining #quality - Mining Quality Phrases from Massive Text Corpora (JL, JS, CW, XR, JH), pp. 1729–1744.
VLDB-2015-HeGC #corpus #named #semantics #using - SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora (YH, KG, XC), pp. 1358–1369.
KDD-2015-RenEWH #approach #automation #corpus #mining #network #recognition #type system - Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach (XR, AEK, CW, JH), pp. 2319–2320.
VLDB-2015-El-KishkySWVH14 #corpus #mining #scalability #topic - Scalable Topical Phrase Mining from Text Corpora (AEK, YS, CW, CRV, JH), pp. 305–316.
SCAM-2014-CaraccioloCSL #corpus #multi #named - Pangea: A Workbench for Statically Analyzing Multi-language Software Corpora (AC, AC, BS, ML), pp. 71–76.
HCI-AIMT-2014-JohnsonMV #corpus #user interface - Harmonic Navigator: An Innovative, Gesture-Driven User Interface for Exploring Harmonic Spaces in Musical Corpora (DJ, BZM, YV), pp. 58–68.
ICSM-2013-DasguptaGMDP #automation #corpus #documentation #traceability - Enhancing Software Traceability by Automatically Expanding Corpora with Relevant Documentation (TD, MG, EM, BD, DP), pp. 320–329.
ECIR-2013-RahimiS #approach #corpus #modelling - A Language Modeling Approach for Extracting Translation Knowledge from Comparable Corpora (RR, AS), pp. 606–617.
ICML-c2-2013-KimVS #approximate #corpus #modelling #topic - A Variational Approximation for Topic Modeling of Hierarchical Corpora (DkK, GMV, LKS), pp. 55–63.
SIGIR-2013-LiangR #corpus #enterprise #information management - Finding knowledgeable groups in enterprise corpora (SL, MdR), pp. 1005–1008.
SIGMOD-2012-SliwkanichSYHB #corpus #scalability #summary #towards #visualisation - Towards scalable summarization and visualization of large text corpora (abstract only) (TS, DS, AY, MH, DB), p. 863.
CIKM-2012-LiCLP #corpus #ontology #performance - Efficient extraction of ontologies from domain specific text corpora (TL, PC, LVSL, RP), pp. 1537–1541.
CIKM-2012-LiLJWZH #corpus #parallel - Joint bilingual name tagging for parallel corpora (QL, HL, HJ, WW, JZ, FH), pp. 1727–1731.
ECIR-2012-TholpadiDBS #clustering #corpus #multi #using - Cluster Labeling for Multilingual Scatter/Gather Using Comparable Corpora (GT, MKD, CB, SKS), pp. 388–400.
CIKM-2011-JacobDG #classification #corpus #multi #social #using - Classification and annotation in social corpora using multiple relations (YJ, LD, PG), pp. 1215–1220.
CIKM-2011-KimJHSZ #approach #corpus #graph #mining - Mining entity translations from comparable corpora: a holistic graph mapping approach (JK, LJ, SwH, YIS, MZ), pp. 1295–1304.
CIKM-2011-VarolCAK #corpus #detection #named - CoDet: sentence-based containment detection in news corpora (EV, FC, CA, OK), pp. 2049–2052.
SAC-2011-YelogluMZ #corpus #multi #summary - Multi-document summarization of scientific corpora (OY, EEM, ANZH), pp. 252–258.
CIKM-2010-AjmeraKLMP #corpus #parallel #web - Alignment of short length parallel corpora with an application to web search (JA, HSK, KPL, SM, MP), pp. 1477–1480.
ECIR-2010-JagarlamudiD #corpus #multi #topic - Extracting Multilingual Topics from Unaligned Comparable Corpora (JJ, HDI), pp. 444–456.
KDD-2010-ZhangSZL #corpus #correlation #multi #process - Evolutionary hierarchical dirichlet processes for multiple correlated time-varying corpora (JZ, YS, CZ, SL), pp. 1079–1088.
SAC-2009-LiuMYGF #corpus #mining #probability - A sentence level probabilistic model for evolutionary theme pattern mining from news corpora (SL, YM, WGY, NG, OF), pp. 1742–1747.
CIKM-2008-UdupaSKJ #corpus #mining - Mining named entity transliteration equivalents from comparable corpora (RU, KS, AK, JJ), pp. 1423–1424.
ECIR-2008-ChoudharyMBB #corpus #evolution #interactive #towards - Towards Characterization of Actor Evolution and Interactions in News Corpora (RC, SM, AB, RB), pp. 422–429.
CIKM-2006-ShiN #adaptation #corpus #information retrieval #parallel - Filtering or adapting: two strategies to exploit noisy parallel corpora for cross-language information retrieval (LS, JYN), pp. 814–815.
ICPR-v1-2006-ZhangLSC #classification #corpus #performance - An Efficient SVM Classifier for Lopsided Corpora (XZ, BCL, WS, LC), pp. 1144–1147.
SIGIR-2006-BalogAR #corpus #enterprise #formal method #modelling - Formal models for expert finding in enterprise corpora (KB, LA, MdR), pp. 43–50.
SIGIR-2006-DiazM #corpus #estimation #modelling #scalability #using - Improving the estimation of relevance models using large external corpora (FD, DM), pp. 154–161.
KDD-2005-TaoZ #corpus #integration #mining - Mining comparable bilingual text corpora for cross-language information integration (TT, CZ), pp. 691–696.
CIKM-2004-LitaC #corpus - Unsupervised question answering data acquisition from local corpora (LVL, JGC), pp. 607–614.
SIGIR-2004-ChengTCWLC #corpus #information retrieval #query #web - Translating unknown queries with web corpora for cross-language information retrieval (PJC, JWT, RCC, JHW, WHL, LFC), pp. 146–153.
SIGIR-2003-SadatYU #automation #corpus #information retrieval - Enhancing cross-language information retrieval by an automatic acquisition of bilingual terminology from comparable corpora (FS, MY, SU), pp. 397–398.
CIKM-2001-GhaniJM #corpus #mining #web - Mining the Web to Create Minority Language Corpora (RG, RJ, DM), pp. 279–286.
SIGIR-2001-FranzMWZ01a #corpus #parallel - Quantifying the Utility of Parallel Corpora (MF, JSM, TW, WJZ), pp. 398–399.
SIGIR-2001-GhaniJM #automation #corpus #generative #query #web - Automatic Web Search Query Generation to Create Minority Language Corpora (RG, RJ, DM), pp. 432–433.
ICML-2000-HosteDSG #corpus - Meta-Learning for Phonemic Annotation of Corpora (VH, WD, EFTKS, SG), pp. 375–382.
SIGIR-1999-Marcu #automation #corpus #research #scalability #summary - The Automatic Construction of Large-Scale Corpora for Summarization Research (DM), pp. 137–144.
SIGIR-1997-Jacquemin #corpus - Guessing Morphology from Terms and Corpora (CJ), pp. 156–165.