BibSLEIGH
Used together with:
mine (8)
languag (8)
compar (7)
text (6)
model (5)

Stem corpora$ (all stems)

38 papers:

SIGMOD-2015-LiuSWRH #corpus #mining #quality
Mining Quality Phrases from Massive Text Corpora (JL, JS, CW, XR, JH), pp. 1729–1744.
VLDB-2015-HeGC #corpus #named #semantics #using
SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora (YH, KG, XC), pp. 1358–1369.
KDD-2015-RenEWH #approach #automation #corpus #mining #network #recognition #type system
Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach (XR, AEK, CW, JH), pp. 2319–2320.
VLDB-2015-El-KishkySWVH14 #corpus #mining #scalability #topic
Scalable Topical Phrase Mining from Text Corpora (AEK, YS, CW, CRV, JH), pp. 305–316.
SCAM-2014-CaraccioloCSL #corpus #multi #named
Pangea: A Workbench for Statically Analyzing Multi-language Software Corpora (AC, AC, BS, ML), pp. 71–76.
HCI-AIMT-2014-JohnsonMV #corpus #user interface
Harmonic Navigator: An Innovative, Gesture-Driven User Interface for Exploring Harmonic Spaces in Musical Corpora (DJ, BZM, YV), pp. 58–68.
ICSM-2013-DasguptaGMDP #automation #corpus #documentation #traceability
Enhancing Software Traceability by Automatically Expanding Corpora with Relevant Documentation (TD, MG, EM, BD, DP), pp. 320–329.
ECIR-2013-RahimiS #approach #corpus #modelling
A Language Modeling Approach for Extracting Translation Knowledge from Comparable Corpora (RR, AS), pp. 606–617.
ICML-c2-2013-KimVS #approximate #corpus #modelling #topic
A Variational Approximation for Topic Modeling of Hierarchical Corpora (DkK, GMV, LKS), pp. 55–63.
SIGIR-2013-LiangR #corpus #enterprise #information management
Finding knowledgeable groups in enterprise corpora (SL, MdR), pp. 1005–1008.
SIGMOD-2012-SliwkanichSYHB #corpus #scalability #summary #towards #visualisation
Towards scalable summarization and visualization of large text corpora (abstract only) (TS, DS, AY, MH, DB), p. 863.
CIKM-2012-LiCLP #corpus #ontology #performance
Efficient extraction of ontologies from domain specific text corpora (TL, PC, LVSL, RP), pp. 1537–1541.
CIKM-2012-LiLJWZH #corpus #parallel
Joint bilingual name tagging for parallel corpora (QL, HL, HJ, WW, JZ, FH), pp. 1727–1731.
ECIR-2012-TholpadiDBS #clustering #corpus #multi #using
Cluster Labeling for Multilingual Scatter/Gather Using Comparable Corpora (GT, MKD, CB, SKS), pp. 388–400.
CIKM-2011-JacobDG #classification #corpus #multi #social #using
Classification and annotation in social corpora using multiple relations (YJ, LD, PG), pp. 1215–1220.
CIKM-2011-KimJHSZ #approach #corpus #graph #mining
Mining entity translations from comparable corpora: a holistic graph mapping approach (JK, LJ, SwH, YIS, MZ), pp. 1295–1304.
CIKM-2011-VarolCAK #corpus #detection #named
CoDet: sentence-based containment detection in news corpora (EV, FC, CA, OK), pp. 2049–2052.
SAC-2011-YelogluMZ #corpus #multi #summary
Multi-document summarization of scientific corpora (OY, EEM, ANZH), pp. 252–258.
CIKM-2010-AjmeraKLMP #corpus #parallel #web
Alignment of short length parallel corpora with an application to web search (JA, HSK, KPL, SM, MP), pp. 1477–1480.
ECIR-2010-JagarlamudiD #corpus #multi #topic
Extracting Multilingual Topics from Unaligned Comparable Corpora (JJ, HDI), pp. 444–456.
KDD-2010-ZhangSZL #corpus #correlation #multi #process
Evolutionary hierarchical dirichlet processes for multiple correlated time-varying corpora (JZ, YS, CZ, SL), pp. 1079–1088.
SAC-2009-LiuMYGF #corpus #mining #probability
A sentence level probabilistic model for evolutionary theme pattern mining from news corpora (SL, YM, WGY, NG, OF), pp. 1742–1747.
CIKM-2008-UdupaSKJ #corpus #mining
Mining named entity transliteration equivalents from comparable corpora (RU, KS, AK, JJ), pp. 1423–1424.
ECIR-2008-ChoudharyMBB #corpus #evolution #interactive #towards
Towards Characterization of Actor Evolution and Interactions in News Corpora (RC, SM, AB, RB), pp. 422–429.
CIKM-2006-ShiN #adaptation #corpus #information retrieval #parallel
Filtering or adapting: two strategies to exploit noisy parallel corpora for cross-language information retrieval (LS, JYN), pp. 814–815.
ICPR-v1-2006-ZhangLSC #classification #corpus #performance
An Efficient SVM Classifier for Lopsided Corpora (XZ, BCL, WS, LC), pp. 1144–1147.
SIGIR-2006-BalogAR #corpus #enterprise #formal method #modelling
Formal models for expert finding in enterprise corpora (KB, LA, MdR), pp. 43–50.
SIGIR-2006-DiazM #corpus #estimation #modelling #scalability #using
Improving the estimation of relevance models using large external corpora (FD, DM), pp. 154–161.
KDD-2005-TaoZ #corpus #integration #mining
Mining comparable bilingual text corpora for cross-language information integration (TT, CZ), pp. 691–696.
CIKM-2004-LitaC #corpus
Unsupervised question answering data acquisition from local corpora (LVL, JGC), pp. 607–614.
SIGIR-2004-ChengTCWLC #corpus #information retrieval #query #web
Translating unknown queries with web corpora for cross-language information retrieval (PJC, JWT, RCC, JHW, WHL, LFC), pp. 146–153.
SIGIR-2003-SadatYU #automation #corpus #information retrieval
Enhancing cross-language information retrieval by an automatic acquisition of bilingual terminology from comparable corpora (FS, MY, SU), pp. 397–398.
CIKM-2001-GhaniJM #corpus #mining #web
Mining the Web to Create Minority Language Corpora (RG, RJ, DM), pp. 279–286.
SIGIR-2001-FranzMWZ01a #corpus #parallel
Quantifying the Utility of Parallel Corpora (MF, JSM, TW, WJZ), pp. 398–399.
SIGIR-2001-GhaniJM #automation #corpus #generative #query #web
Automatic Web Search Query Generation to Create Minority Language Corpora (RG, RJ, DM), pp. 432–433.
ICML-2000-HosteDSG #corpus
Meta-Learning for Phonemic Annotation of Corpora (VH, WD, EFTKS, SG), pp. 375–382.
SIGIR-1999-Marcu #automation #corpus #research #scalability #summary
The Automatic Construction of Large-Scale Corpora for Summarization Research (DM), pp. 137–144.
SIGIR-1997-Jacquemin #corpus
Guessing Morphology from Terms and Corpora (CJ), pp. 156–165.

Bibliography of Software Language Engineering in Generated Hypertext (BibSLEIGH) is created and maintained by Dr. Vadim Zaytsev.
Hosted as a part of SLEBOK on GitHub.