118 papers:
SANER-2015-AggarwalRTHGS #debugging #detection #re-engineering- Detecting duplicate bug reports with software engineering domain knowledge (KA, TR, FT, AH, RG, ES), pp. 211–220.
SCAM-2015-BoisselleA #debugging #empirical- The impact of cross-distribution bug duplicates, empirical study on Debian and Ubuntu (VB, BA), pp. 131–140.
MoDELS-2015-RagoMD #case study #identification #semantics- Identifying duplicate functionality in textual use cases by aligning semantic actions (SoSyM abstract) (AR, CM, JADP), p. 446.
SAC-2015-BezuBRVVF #component #detection #multi #similarity #web- Multi-component similarity method for web product duplicate detection (RvB, SB, RR, JV, DV, FF), pp. 761–768.
ASE-2014-ThungKL #debugging #detection #named #tool support- DupFinder: integrated tool support for duplicate bug report detection (FT, PSK, DL), pp. 871–874.
MSR-2014-KleinCK #debugging #detection- New features for duplicate bug detection (NK, CSC, NAK), pp. 324–327.
MSR-2014-LazarRS #debugging #detection #metric #similarity #using- Improving the accuracy of duplicate bug report detection using textual similarity measures (AL, SR, BS), pp. 308–311.
MSR-2014-LazarRS14a #dataset #debugging #generative- Generating duplicate bug datasets (AL, SR, BS), pp. 392–395.
CIAA-2014-DumitranGMM #bound- Bounded Prefix-Suffix Duplication (MD, JG, FM, VM), pp. 176–187.
CIKM-2014-HeiseKN #clustering- Estimating the Number and Sizes of Fuzzy-Duplicate Clusters (AH, GK, FN), pp. 959–968.
ICPR-2014-LiuLS #documentation #image #novel- Novel Global and Local Features for Near-Duplicate Document Image Matching (LL, YL, CYS), pp. 4624–4629.
ICPR-2014-NegrelPG #image #learning #metric #performance #reduction #retrieval #using- Efficient Metric Learning Based Dimension Reduction Using Sparse Projectors for Image Near Duplicate Retrieval (RN, DP, PHG), pp. 738–743.
ICPR-2014-YangLLZ #consistency #geometry #image #rank- Low Rank Global Geometric Consistency for Partial-Duplicate Image Search (LY, YL, ZL, HZ), pp. 3939–3944.
SIGIR-2014-BaruahRS- The effect of expanding relevance judgements with duplicates (GB, AR, MDS), pp. 1159–1162.
DocEng-2013-WilliamsG #detection #library- Near duplicate detection in an academic digital library (KW, CLG), pp. 91–94.
ICDAR-2013-LiuLSX #documentation #image #modelling #retrieval #word- Modeling Local Word Spatial Configurations for Near Duplicate Document Image Retrieval (LL, YL, CYS, JX), pp. 235–239.
VLDB-2013-DuttaNB #approach #approximate #data type #detection #streaming- Streaming Quotient Filter: A Near Optimal Approximate Duplicate Detection Approach for Data Streams (SD, AN, SKB), pp. 589–600.
CSMR-2013-LerchM #debugging- Finding Duplicates of Your Yet Unwritten Bug Report (JL, MM), pp. 69–78.
MSR-2013-AlipourHS #approach #debugging #detection #towards- A contextual approach towards more accurate duplicate bug report detection (AA, AH, ES), pp. 183–192.
MSR-2013-AmouiKATLL #detection #experience #fault #industrial #search-based- Search-based duplicate defect detection: an industrial experience (MA, NK, AAD, LT, SL, WL), pp. 173–182.
LATA-2013-BenzaidDE #algorithm #complexity- Duplication-Loss Genome Alignment: Complexity and Algorithm (BB, RD, NEM), pp. 116–127.
CAiSE-2013-BakkerFV #approach #detection #hybrid #web- A Hybrid Model Words-Driven Approach for Web Product Duplicate Detection (MdB, FF, DV), pp. 149–161.
ECIR-2013-IgnatovKC #approach #detection- Near-Duplicate Detection for Online-Shops Owners: An FCA-Based Approach (DII, AVK, YC), pp. 722–725.
ICML-c3-2013-FriedlandJL #detection #social- Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events (LF, DJ, ML), pp. 1175–1183.
SAC-2013-BakkerVFK #detection #web- Model words-driven approaches for duplicate detection on the web (MdB, DV, FF, UK), pp. 717–723.
SAC-2013-LeitaoC #adaptation #detection #optimisation #performance #using #xml- Efficient XML duplicate detection using an adaptive two-level optimization (LL, PC), pp. 832–837.
ASE-2012-NguyenNNLS #debugging #detection #information retrieval #modelling #topic- Duplicate bug report detection with a combination of information retrieval and topic modeling (ATN, TTN, TNN, DL, CS), pp. 70–79.
CSMR-2012-KaushikT #case study #comparative #debugging #detection #information retrieval #modelling #performance- A Comparative Study of the Performance of IR Models on Duplicate Bug Detection (NK, LT), pp. 159–168.
CSMR-2012-TianSL #debugging #identification- Improved Duplicate Bug Report Identification (YT, CS, DL), pp. 385–390.
CIKM-2012-SarmaJMB #automation #scalability- An automatic blocking mechanism for large-scale de-duplication tasks (ADS, AJ, AM, PB), pp. 1055–1064.
CIKM-2012-ZhouZ #debugging #learning #rank- Learning to rank duplicate bug reports (JZ, HZ), pp. 852–861.
ICPR-2012-VitaladevuniCPN #detection #documentation #image #using- Detecting near-duplicate document images using interest point matching (SNPV, FC, RP, PN), pp. 347–350.
ICSE-2012-DangWZZN #clustering #named #similarity #stack- ReBucket: A method for clustering duplicate crash reports based on call stack similarity (YD, RW, HZ, DZ, PN), pp. 1084–1093.
ASE-2011-SunLKJ #debugging #retrieval #towards- Towards more accurate retrieval of duplicate bug reports (CS, DL, SCK, JJ), pp. 253–262.
PODS-2011-JowhariST #bound #problem- Tight bounds for Lp samplers, finding duplicates in streams, and related problems (HJ, MS, GT), pp. 49–58.
AFL-2011-Ito #strict- K-Restricted Duplication Closure of Languages (MI), pp. 28–33.
CIKM-2011-KoloniariNPS #distributed- One is enough: distributed filtering for duplicate elimination (GK, NN, EP, DS), pp. 433–442.
CIKM-2011-LangeN #metric #similarity #why- Frequency-aware similarity measures: why Arnold Schwarzenegger is always a duplicate (DL, FN), pp. 243–248.
CIKM-2011-LeitaoC #detection #optimisation- Duplicate detection through structure optimization (LL, PC), pp. 443–452.
CIKM-2011-SoodL #detection #probability #using- Probabilistic near-duplicate detection using simhash (SS, DL), pp. 1117–1126.
CIKM-2011-YalnizCM #detection #scalability- Partial duplicate detection for large book collections (IZY, EFC, RM), pp. 469–474.
SAC-2011-WuW #framework- A data de-duplication access framework for solid state drives (CHW, HSW), pp. 600–604.
SIGMOD-2010-WangWLWWLTXL #dataset #detection #named- MapDupReducer: detecting near duplicates over massive datasets (CW, JW, XL, WW, HW, HL, WT, JX, RL), pp. 1119–1122.
CSMR-2010-CavalcantiACLM #debugging #problem- An Initial Study on the Bug Report Duplication Problem (YCC, ESdA, CEAdC, DL, SRdLM), pp. 264–267.
CAiSE-2010-HordijkW #case study- Rationality of Cross-System Data Duplication: A Case Study (WH, RW), pp. 68–82.
ECIR-2010-KimCLL #detection #effectiveness #image #named #sequence #using- BASIL: Effective Near-Duplicate Image Detection Using Gene Sequence Alignment (HsK, HWC, JL, DL), pp. 229–240.
ICPR-2010-BalujaC #learning #performance #retrieval- Beyond “Near Duplicates”: Learning Hash Codes for Efficient Similar-Image Retrieval (SB, MC), pp. 543–547.
ICPR-2010-HarmanciH #adaptation #image #query- Content Adaptive Hash Lookups for Near-Duplicate Image Search by Full or Partial Image Queries (OH, IH), pp. 1582–1585.
ICPR-2010-IdeSDTM #classification #video- Classification of Near-Duplicate Video Segments Based on Their Appearance Patterns (II, YS, DD, TT, HM), pp. 3129–3133.
ICPR-2010-WuXJHCL #constraints #geometry #image #invariant #retrieval- Adding Affine Invariant Geometric Constraint for Partial-Duplicate Image Retrieval (ZW, QX, SJ, QH, PC, LL), pp. 842–845.
SIGIR-2010-HajishirziYK #adaptation #detection #learning #similarity- Adaptive near-duplicate detection via similarity learning (HH, WtY, AK), pp. 419–426.
SIGIR-2010-ZhangZYH #detection #performance #sequence- Efficient partial-duplicate detection based on sequence matching (QZ, YZ, HY, XH), pp. 675–682.
ICSE-2010-SongWXZM #debugging #detection #named- JDF: detecting duplicate bug reports in Jazz (YS, XW, TX, LZ, HM), pp. 315–316.
ICSE-2010-SunLWJK #approach #debugging #retrieval- A discriminative model approach for accurate duplicate bug report retrieval (CS, DL, XW, JJ, SCK), pp. 45–54.
ICLP-2010-Dandois10 #logic programming #program analysis #source code- Program analysis for code duplication in logic programs (CD), pp. 241–247.
VLDB-2009-BeskalesSIB #detection #modelling #query- Modeling and Querying Possible Repairs in Duplicate Detection (GB, MAS, IFI, SBD), pp. 598–609.
VLDB-2009-HassanzadehCML #algorithm #clustering #detection #framework- Framework for Evaluating Clustering Algorithms in Duplicate Detection (OH, FC, RJM, HCL), pp. 1282–1293.
CIKM-2009-AgarwalKLCGGHRS #normalisation #web- URL normalization for de-duplication of web pages (AA, HSK, KPL, KPC, SG, PKG, CH, AR, AS), pp. 1987–1990.
DATE-2008-KleanthousS #detection #named- CATCH: A Mechanism for Dynamically Detecting Cache-Content-Duplication and its Application to Instruction Caches (MK, YS), pp. 1426–1431.
VLDB-2008-WeisNJLS #detection- Industry-scale duplicate detection (MW, FN, UJ, JL, HS), pp. 1253–1264.
ICSM-2008-BettenburgPZK #debugging #harmful #question- Duplicate bug reports considered harmful ... really? (NB, RP, TZ, SK), pp. 337–345.
DLT-2008-ItoKKS #sequence- Duplication in DNA Sequences (MI, LK, ZK, SS), pp. 419–430.
CIKM-2008-HerschelN #detection #graph #scalability- Scaling up duplicate detection in graph data (MH, FN), pp. 1325–1326.
CIKM-2008-HuangWL #detection #precise- Achieving both high precision and high recall in near-duplicate detection (LH, LW, XL), pp. 63–72.
SIGIR-2008-TheobaldSP #detection #named #performance #robust #scalability #web- SpotSigs: robust and efficient near duplicate detection in large web collections (MT, JS, AP), pp. 563–570.
BX-2008-Matsuda1 #bidirectional #source code- Bidirectionalization of Programs with Duplication through Complement Function Derivation (KM), p. 40.
ICSE-2008-WangZXAS #approach #debugging #detection #execution #natural language #using- An approach to detecting duplicate bug reports using natural language and execution information (XW, LZ, TX, JA, JS), pp. 461–470.
DATE-2007-BaneresCK- Layout-aware gate duplication and buffer insertion (DB, JC, MK), pp. 1367–1372.
VLDB-2007-ShenZHSZ #detection #named #realtime #video- UQLIPS: A Real-time Near-duplicate Video Clip Detection System (HTS, XZ, ZH, JS, XZ), pp. 1374–1377.
DLT-2007-Leupold- Duplication Roots (PL), pp. 290–299.
CIKM-2007-LeitaoCW #detection #fuzzy #similarity #xml- Structure-based inference of xml similarity for fuzzy duplicate detection (LL, PC, MW), pp. 293–302.
SIGIR-2007-HuffmanLSWYR #detection #evaluation #multi- Multiple-signal duplicate detection for search evaluation (SBH, ARL, APS, HWT, FY, HR), pp. 223–230.
SIGIR-2007-Potthast #detection #similarity #wiki- Wikipedia in the pocket: indexing technology for near-duplicate detection and high similarity search (MP), p. 909.
ICSE-2007-RunesonAN #detection #fault #natural language #using- Detection of Duplicate Defect Reports Using Natural Language Processing (PR, MA, ON), pp. 499–510.
SIGMOD-2006-DengR #approximate #detection #streaming #using- Approximately detecting duplicates for streaming data using stable bloom filters (FD, DR), pp. 25–36.
PEPM-2006-SwadiTKP #approach #monad #staging- A monadic approach for avoiding code duplication when staging memoized functions (KNS, WT, OK, EP), pp. 160–169.
DLT-2006-ItoLS #bound- Closure of Language Classes Under Bounded Duplication (MI, PL, KST), pp. 238–247.
ICPR-v2-2006-YangQHE #adaptation #geometry #image #recognition #retrieval #using- Near-Duplicate Image Recognition and Content-based Image Retrieval using Adaptive Hierarchical Geometric Centroids (MY, GQ, JH, DE), pp. 958–961.
ICPR-v4-2006-LuoHQ #detection #image #robust- Robust Detection of Region-Duplication Forgery in Digital Image (WL, JH, GQ), pp. 746–749.
SIGIR-2006-Henzinger #algorithm #evaluation #scalability #web- Finding near-duplicate web pages: a large-scale evaluation of algorithms (MRH), pp. 284–291.
SIGIR-2006-YangC #clustering #detection- Near-duplicate detection by instance-level constrained clustering (HY, JPC), pp. 421–428.
SAC-2006-GomesSS #web- Managing duplicates in a web archive (DG, ALS, MJS), pp. 818–825.
DATE-2005-HuLDKVI #detection #fault- Compiler-Directed Instruction Duplication for Soft Error Detection (JSH, FL, VD, MTK, NV, MJI), pp. 1056–1057.
SIGMOD-2005-WeisN #xml- DogmatiX Tracks down Duplicates in XML (MW, FN), pp. 431–442.
ICSM-2005-KapserG #tool support- Improved Tool Support for the Investigation of Duplication in Software (CK, MWG), pp. 305–314.
SAS-2005-ChenKK #execution #memory management #reliability- Memory Space Conscious Loop Iteration Duplication for Reliable Execution (GC, MTK, MK), pp. 52–69.
KDD-2005-NorenOB #database #detection #safety- A hit-miss model for duplicate detection in the WHO drug safety database (GNN, RO, AB), pp. 459–468.
SIGIR-2005-FetterlyMN #detection #web- Detecting phrase-level duplication on the world wide web (DF, MM, MN), pp. 170–177.
DATE-v2-2004-SogomonyanMOG #self- A New Self-Checking Sum-Bit Duplicated Carry-Select Adder (ESS, DM, VO, MG), pp. 1360–1361.
WCRE-2004-RiegerDL- Insights into System-Wide Code Duplication (MR, SD, ML), pp. 100–109.
SIGIR-2004-ConradS #corpus #detection- Constructing a text corpus for inexact duplicate detection (JGC, CPS), pp. 582–583.
CIKM-2003-ConradGS #detection #documentation #online #reliability #retrieval- Online duplicate document detection: signature reliability in a dynamic retrieval environment (JGC, XSG, CPS), pp. 443–452.
KDD-2003-BilenkoM #adaptation #detection #metric #similarity #string #using- Adaptive duplicate detection using learnable string similarity measures (MB, RJM), pp. 39–48.
RTA-2003-KhasidashviliG #partial order #semantics #term rewriting- Stable Computational Semantics of Conflict-Free Rewrite Systems (Partial Orders with Duplication) (ZK, JRWG), pp. 467–482.
VLDB-2002-AnanthakrishnaCG #fuzzy- Eliminating Fuzzy Duplicates in Data Warehouses (RA, SC, VG), pp. 586–597.
SIGIR-2002-Yang #documentation #keyword #string- Chinese keyword extraction based on max-duplicated strings of the documents (WY), pp. 439–440.
ESOP-2001-KomondoorH #dependence #tool support #using- Tool Demonstration: Finding Duplicated Code Using Program Dependences (RK, SH), pp. 383–386.
SAS-2001-KomondoorH #identification #slicing #source code #using- Using Slicing to Identify Duplication in Source Code (RK, SH), pp. 40–56.
ICSE-2001-FioravantiMN #analysis #object-oriented #re-engineering- Reengineering Analysis of Object-Oriented Systems via Duplication (FF, GM, PN), pp. 577–586.
CC-2001-Gregg #scheduling- Comparing Tail Duplication with Compensation Code in Single Path Global Instruction Scheduling (DG), pp. 200–212.
POPL-2000-AspertiCM #recursion- (Optimal) Duplication is not Elementary Recursive (AA, PC, SM), pp. 96–107.
SAC-2000-FeekinC #detection #sorting #using- Duplicate Detection Using K-way Sorting Method (AF, ZC), pp. 323–327.
CL-2000-KhizderTW #logic #reasoning- Reasoning about Duplicate Elimination with Description Logic (VLK, DT, GEW), pp. 1017–1032.
ICDAR-1999-LeeH #detection #documentation- Duplicate Detection in Symbolically Compressed Documents (DSL, JJH), pp. 305–308.
ICDAR-1999-Lopresti #algorithm #detection #documentation #modelling- Models and Algorithms for Duplicate Document Detection (DPL), pp. 297–300.
ICSM-1999-DucasseRD #approach #detection #independence- A Language Independent Approach for Detecting Duplicated Code (SD, MR, SD), pp. 109–118.
ICDAR-1997-DoermannLK #database #detection #documentation #image- The Detection of Duplicates in Document Image Databases (DSD, HL, OEK), pp. 314–318.
SIGMOD-1995-GriffinL #incremental #maintenance- Incremental Maintenance of Views with Duplicates (TG, LL), pp. 328–339.
VLDB-1995-YanG #information management- Duplicate Removal in Information Dissemination (TWY, HGM), pp. 66–77.
WCRE-1995-Baker #on the #scalability- On Finding Duplication and Near-Duplication in Large Software Systems (BSB), pp. 86–95.
LICS-1995-Lynch- Paramodulation without Duplication (CL), pp. 167–177.
CIKM-1994-ArefS #database #process #proximity- Hashing by Proximity to Process Duplicates in Spatial Databases (WGA, HS), pp. 347–354.
FSE-1994-CeceFI #communication #fault- Duplication, Insertion and Lossiness Errors in Unreliable Communication Channels (GC, AF, SPI), pp. 35–43.
VLDB-1990-MumickPR- The Magic of Duplicates and Aggregates (ISM, HP, RR), pp. 264–277.
NACLP-1990-Spencer #proving- Avoiding Duplicate Proofs (BS), pp. 569–584.
VLDB-1989-SaakeLPW #information management #prototype #sorting- Sorting, Grouping and Duplicate Elimination in the Advanced Information Management Prototype (GS, VL, PP, LMW), pp. 307–316.
PODS-1982-DayalGK #algebra #relational- An Extended Relational Algebra with Control over Duplicate Elimination (UD, NG, RHK), pp. 117–123.
SOSP-1977-Ellis #consistency #correctness #database- Consistency and Correctness of Duplicate Database Systems (CAE), pp. 67–84.