118 papers:
- SANER-2015-AggarwalRTHGS #debugging #detection #re-engineering
- Detecting duplicate bug reports with software engineering domain knowledge (KA, TR, FT, AH, RG, ES), pp. 211–220.
- SCAM-2015-BoisselleA #debugging #empirical
- The impact of cross-distribution bug duplicates, empirical study on Debian and Ubuntu (VB, BA), pp. 131–140.
- MoDELS-2015-RagoMD #case study #identification #semantics
- Identifying duplicate functionality in textual use cases by aligning semantic actions (SoSyM abstract) (AR, CM, JADP), p. 446.
- SAC-2015-BezuBRVVF #component #detection #multi #similarity #web
- Multi-component similarity method for web product duplicate detection (RvB, SB, RR, JV, DV, FF), pp. 761–768.
- ASE-2014-ThungKL #debugging #detection #named #tool support
- DupFinder: integrated tool support for duplicate bug report detection (FT, PSK, DL), pp. 871–874.
- MSR-2014-KleinCK #debugging #detection
- New features for duplicate bug detection (NK, CSC, NAK), pp. 324–327.
- MSR-2014-LazarRS #debugging #detection #metric #similarity #using
- Improving the accuracy of duplicate bug report detection using textual similarity measures (AL, SR, BS), pp. 308–311.
- MSR-2014-LazarRS14a #dataset #debugging #generative
- Generating duplicate bug datasets (AL, SR, BS), pp. 392–395.
- CIAA-2014-DumitranGMM #bound
- Bounded Prefix-Suffix Duplication (MD, JG, FM, VM), pp. 176–187.
- CIKM-2014-HeiseKN #clustering
- Estimating the Number and Sizes of Fuzzy-Duplicate Clusters (AH, GK, FN), pp. 959–968.
- ICPR-2014-LiuLS #documentation #image #novel
- Novel Global and Local Features for Near-Duplicate Document Image Matching (LL, YL, CYS), pp. 4624–4629.
- ICPR-2014-NegrelPG #image #learning #metric #performance #reduction #retrieval #using
- Efficient Metric Learning Based Dimension Reduction Using Sparse Projectors for Image Near Duplicate Retrieval (RN, DP, PHG), pp. 738–743.
- ICPR-2014-YangLLZ #consistency #geometry #image #rank
- Low Rank Global Geometric Consistency for Partial-Duplicate Image Search (LY, YL, ZL, HZ), pp. 3939–3944.
- SIGIR-2014-BaruahRS
- The effect of expanding relevance judgements with duplicates (GB, AR, MDS), pp. 1159–1162.
- DocEng-2013-WilliamsG #detection #library
- Near duplicate detection in an academic digital library (KW, CLG), pp. 91–94.
- ICDAR-2013-LiuLSX #documentation #image #modelling #retrieval #word
- Modeling Local Word Spatial Configurations for Near Duplicate Document Image Retrieval (LL, YL, CYS, JX), pp. 235–239.
- VLDB-2013-DuttaNB #approach #approximate #data type #detection #streaming
- Streaming Quotient Filter: A Near Optimal Approximate Duplicate Detection Approach for Data Streams (SD, AN, SKB), pp. 589–600.
- CSMR-2013-LerchM #debugging
- Finding Duplicates of Your Yet Unwritten Bug Report (JL, MM), pp. 69–78.
- MSR-2013-AlipourHS #approach #debugging #detection #towards
- A contextual approach towards more accurate duplicate bug report detection (AA, AH, ES), pp. 183–192.
- MSR-2013-AmouiKATLL #detection #experience #fault #industrial #search-based
- Search-based duplicate defect detection: an industrial experience (MA, NK, AAD, LT, SL, WL), pp. 173–182.
- LATA-2013-BenzaidDE #algorithm #complexity
- Duplication-Loss Genome Alignment: Complexity and Algorithm (BB, RD, NEM), pp. 116–127.
- CAiSE-2013-BakkerFV #approach #detection #hybrid #web
- A Hybrid Model Words-Driven Approach for Web Product Duplicate Detection (MdB, FF, DV), pp. 149–161.
- ECIR-2013-IgnatovKC #approach #detection
- Near-Duplicate Detection for Online-Shops Owners: An FCA-Based Approach (DII, AVK, YC), pp. 722–725.
- ICML-c3-2013-FriedlandJL #detection #social
- Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events (LF, DJ, ML), pp. 1175–1183.
- SAC-2013-BakkerVFK #detection #web
- Model words-driven approaches for duplicate detection on the web (MdB, DV, FF, UK), pp. 717–723.
- SAC-2013-LeitaoC #adaptation #detection #optimisation #performance #using #xml
- Efficient XML duplicate detection using an adaptive two-level optimization (LL, PC), pp. 832–837.
- ASE-2012-NguyenNNLS #debugging #detection #information retrieval #modelling #topic
- Duplicate bug report detection with a combination of information retrieval and topic modeling (ATN, TTN, TNN, DL, CS), pp. 70–79.
- CSMR-2012-KaushikT #case study #comparative #debugging #detection #information retrieval #modelling #performance
- A Comparative Study of the Performance of IR Models on Duplicate Bug Detection (NK, LT), pp. 159–168.
- CSMR-2012-TianSL #debugging #identification
- Improved Duplicate Bug Report Identification (YT, CS, DL), pp. 385–390.
- CIKM-2012-SarmaJMB #automation #scalability
- An automatic blocking mechanism for large-scale de-duplication tasks (ADS, AJ, AM, PB), pp. 1055–1064.
- CIKM-2012-ZhouZ #debugging #learning #rank
- Learning to rank duplicate bug reports (JZ, HZ), pp. 852–861.
- ICPR-2012-VitaladevuniCPN #detection #documentation #image #using
- Detecting near-duplicate document images using interest point matching (SNPV, FC, RP, PN), pp. 347–350.
- ICSE-2012-DangWZZN #clustering #named #similarity #stack
- ReBucket: A method for clustering duplicate crash reports based on call stack similarity (YD, RW, HZ, DZ, PN), pp. 1084–1093.
- ASE-2011-SunLKJ #debugging #retrieval #towards
- Towards more accurate retrieval of duplicate bug reports (CS, DL, SCK, JJ), pp. 253–262.
- PODS-2011-JowhariST #bound #problem
- Tight bounds for Lp samplers, finding duplicates in streams, and related problems (HJ, MS, GT), pp. 49–58.
- AFL-2011-Ito #strict
- K-Restricted Duplication Closure of Languages (MI), pp. 28–33.
- CIKM-2011-KoloniariNPS #distributed
- One is enough: distributed filtering for duplicate elimination (GK, NN, EP, DS), pp. 433–442.
- CIKM-2011-LangeN #metric #similarity #why
- Frequency-aware similarity measures: why Arnold Schwarzenegger is always a duplicate (DL, FN), pp. 243–248.
- CIKM-2011-LeitaoC #detection #optimisation
- Duplicate detection through structure optimization (LL, PC), pp. 443–452.
- CIKM-2011-SoodL #detection #probability #using
- Probabilistic near-duplicate detection using simhash (SS, DL), pp. 1117–1126.
- CIKM-2011-YalnizCM #detection #scalability
- Partial duplicate detection for large book collections (IZY, EFC, RM), pp. 469–474.
- SAC-2011-WuW #framework
- A data de-duplication access framework for solid state drives (CHW, HSW), pp. 600–604.
- SIGMOD-2010-WangWLWWLTXL #dataset #detection #named
- MapDupReducer: detecting near duplicates over massive datasets (CW, JW, XL, WW, HW, HL, WT, JX, RL), pp. 1119–1122.
- CSMR-2010-CavalcantiACLM #debugging #problem
- An Initial Study on the Bug Report Duplication Problem (YCC, ESdA, CEAdC, DL, SRdLM), pp. 264–267.
- CAiSE-2010-HordijkW #case study
- Rationality of Cross-System Data Duplication: A Case Study (WH, RW), pp. 68–82.
- ECIR-2010-KimCLL #detection #effectiveness #image #named #sequence #using
- BASIL: Effective Near-Duplicate Image Detection Using Gene Sequence Alignment (HsK, HWC, JL, DL), pp. 229–240.
- ICPR-2010-BalujaC #learning #performance #retrieval
- Beyond “Near Duplicates”: Learning Hash Codes for Efficient Similar-Image Retrieval (SB, MC), pp. 543–547.
- ICPR-2010-HarmanciH #adaptation #image #query
- Content Adaptive Hash Lookups for Near-Duplicate Image Search by Full or Partial Image Queries (OH, IH), pp. 1582–1585.
- ICPR-2010-IdeSDTM #classification #video
- Classification of Near-Duplicate Video Segments Based on Their Appearance Patterns (II, YS, DD, TT, HM), pp. 3129–3133.
- ICPR-2010-WuXJHCL #constraints #geometry #image #invariant #retrieval
- Adding Affine Invariant Geometric Constraint for Partial-Duplicate Image Retrieval (ZW, QX, SJ, QH, PC, LL), pp. 842–845.
- SIGIR-2010-HajishirziYK #adaptation #detection #learning #similarity
- Adaptive near-duplicate detection via similarity learning (HH, WtY, AK), pp. 419–426.
- SIGIR-2010-ZhangZYH #detection #performance #sequence
- Efficient partial-duplicate detection based on sequence matching (QZ, YZ, HY, XH), pp. 675–682.
- ICSE-2010-SongWXZM #debugging #detection #named
- JDF: detecting duplicate bug reports in Jazz (YS, XW, TX, LZ, HM), pp. 315–316.
- ICSE-2010-SunLWJK #approach #debugging #retrieval
- A discriminative model approach for accurate duplicate bug report retrieval (CS, DL, XW, JJ, SCK), pp. 45–54.
- ICLP-2010-Dandois10 #logic programming #program analysis #source code
- Program analysis for code duplication in logic programs (CD), pp. 241–247.
- VLDB-2009-BeskalesSIB #detection #modelling #query
- Modeling and Querying Possible Repairs in Duplicate Detection (GB, MAS, IFI, SBD), pp. 598–609.
- VLDB-2009-HassanzadehCML #algorithm #clustering #detection #framework
- Framework for Evaluating Clustering Algorithms in Duplicate Detection (OH, FC, RJM, HCL), pp. 1282–1293.
- CIKM-2009-AgarwalKLCGGHRS #normalisation #web
- URL normalization for de-duplication of web pages (AA, HSK, KPL, KPC, SG, PKG, CH, AR, AS), pp. 1987–1990.
- DATE-2008-KleanthousS #detection #named
- CATCH: A Mechanism for Dynamically Detecting Cache-Content-Duplication and its Application to Instruction Caches (MK, YS), pp. 1426–1431.
- VLDB-2008-WeisNJLS #detection
- Industry-scale duplicate detection (MW, FN, UJ, JL, HS), pp. 1253–1264.
- ICSM-2008-BettenburgPZK #debugging #harmful #question
- Duplicate bug reports considered harmful ... really? (NB, RP, TZ, SK), pp. 337–345.
- DLT-2008-ItoKKS #sequence
- Duplication in DNA Sequences (MI, LK, ZK, SS), pp. 419–430.
- CIKM-2008-HerschelN #detection #graph #scalability
- Scaling up duplicate detection in graph data (MH, FN), pp. 1325–1326.
- CIKM-2008-HuangWL #detection #precise
- Achieving both high precision and high recall in near-duplicate detection (LH, LW, XL), pp. 63–72.
- SIGIR-2008-TheobaldSP #detection #named #performance #robust #scalability #web
- SpotSigs: robust and efficient near duplicate detection in large web collections (MT, JS, AP), pp. 563–570.
- BX-2008-Matsuda1 #bidirectional #source code
- Bidirectionalization of Programs with Duplication through Complement Function Derivation (KM), p. 40.
- ICSE-2008-WangZXAS #approach #debugging #detection #execution #natural language #using
- An approach to detecting duplicate bug reports using natural language and execution information (XW, LZ, TX, JA, JS), pp. 461–470.
- DATE-2007-BaneresCK
- Layout-aware gate duplication and buffer insertion (DB, JC, MK), pp. 1367–1372.
- VLDB-2007-ShenZHSZ #detection #named #realtime #video
- UQLIPS: A Real-time Near-duplicate Video Clip Detection System (HTS, XZ, ZH, JS, XZ), pp. 1374–1377.
- DLT-2007-Leupold
- Duplication Roots (PL), pp. 290–299.
- CIKM-2007-LeitaoCW #detection #fuzzy #similarity #xml
- Structure-based inference of XML similarity for fuzzy duplicate detection (LL, PC, MW), pp. 293–302.
- SIGIR-2007-HuffmanLSWYR #detection #evaluation #multi
- Multiple-signal duplicate detection for search evaluation (SBH, ARL, APS, HWT, FY, HR), pp. 223–230.
- SIGIR-2007-Potthast #detection #similarity #wiki
- Wikipedia in the pocket: indexing technology for near-duplicate detection and high similarity search (MP), p. 909.
- ICSE-2007-RunesonAN #detection #fault #natural language #using
- Detection of Duplicate Defect Reports Using Natural Language Processing (PR, MA, ON), pp. 499–510.
- SIGMOD-2006-DengR #approximate #detection #streaming #using
- Approximately detecting duplicates for streaming data using stable bloom filters (FD, DR), pp. 25–36.
- PEPM-2006-SwadiTKP #approach #monad #staging
- A monadic approach for avoiding code duplication when staging memoized functions (KNS, WT, OK, EP), pp. 160–169.
- DLT-2006-ItoLS #bound
- Closure of Language Classes Under Bounded Duplication (MI, PL, KST), pp. 238–247.
- ICPR-v2-2006-YangQHE #adaptation #geometry #image #recognition #retrieval #using
- Near-Duplicate Image Recognition and Content-based Image Retrieval using Adaptive Hierarchical Geometric Centroids (MY, GQ, JH, DE), pp. 958–961.
- ICPR-v4-2006-LuoHQ #detection #image #robust
- Robust Detection of Region-Duplication Forgery in Digital Image (WL, JH, GQ), pp. 746–749.
- SIGIR-2006-Henzinger #algorithm #evaluation #scalability #web
- Finding near-duplicate web pages: a large-scale evaluation of algorithms (MRH), pp. 284–291.
- SIGIR-2006-YangC #clustering #detection
- Near-duplicate detection by instance-level constrained clustering (HY, JPC), pp. 421–428.
- SAC-2006-GomesSS #web
- Managing duplicates in a web archive (DG, ALS, MJS), pp. 818–825.
- DATE-2005-HuLDKVI #detection #fault
- Compiler-Directed Instruction Duplication for Soft Error Detection (JSH, FL, VD, MTK, NV, MJI), pp. 1056–1057.
- SIGMOD-2005-WeisN #xml
- DogmatiX Tracks down Duplicates in XML (MW, FN), pp. 431–442.
- ICSM-2005-KapserG #tool support
- Improved Tool Support for the Investigation of Duplication in Software (CK, MWG), pp. 305–314.
- SAS-2005-ChenKK #execution #memory management #reliability
- Memory Space Conscious Loop Iteration Duplication for Reliable Execution (GC, MTK, MK), pp. 52–69.
- KDD-2005-NorenOB #database #detection #safety
- A hit-miss model for duplicate detection in the WHO drug safety database (GNN, RO, AB), pp. 459–468.
- SIGIR-2005-FetterlyMN #detection #web
- Detecting phrase-level duplication on the world wide web (DF, MM, MN), pp. 170–177.
- DATE-v2-2004-SogomonyanMOG #self
- A New Self-Checking Sum-Bit Duplicated Carry-Select Adder (ESS, DM, VO, MG), pp. 1360–1361.
- WCRE-2004-RiegerDL
- Insights into System-Wide Code Duplication (MR, SD, ML), pp. 100–109.
- SIGIR-2004-ConradS #corpus #detection
- Constructing a text corpus for inexact duplicate detection (JGC, CPS), pp. 582–583.
- CIKM-2003-ConradGS #detection #documentation #online #reliability #retrieval
- Online duplicate document detection: signature reliability in a dynamic retrieval environment (JGC, XSG, CPS), pp. 443–452.
- KDD-2003-BilenkoM #adaptation #detection #metric #similarity #string #using
- Adaptive duplicate detection using learnable string similarity measures (MB, RJM), pp. 39–48.
- RTA-2003-KhasidashviliG #partial order #semantics #term rewriting
- Stable Computational Semantics of Conflict-Free Rewrite Systems (Partial Orders with Duplication) (ZK, JRWG), pp. 467–482.
- VLDB-2002-AnanthakrishnaCG #fuzzy
- Eliminating Fuzzy Duplicates in Data Warehouses (RA, SC, VG), pp. 586–597.
- SIGIR-2002-Yang #documentation #keyword #string
- Chinese keyword extraction based on max-duplicated strings of the documents (WY), pp. 439–440.
- ESOP-2001-KomondoorH #dependence #tool support #using
- Tool Demonstration: Finding Duplicated Code Using Program Dependences (RK, SH), pp. 383–386.
- SAS-2001-KomondoorH #identification #slicing #source code #using
- Using Slicing to Identify Duplication in Source Code (RK, SH), pp. 40–56.
- ICSE-2001-FioravantiMN #analysis #object-oriented #re-engineering
- Reengineering Analysis of Object-Oriented Systems via Duplication (FF, GM, PN), pp. 577–586.
- CC-2001-Gregg #scheduling
- Comparing Tail Duplication with Compensation Code in Single Path Global Instruction Scheduling (DG), pp. 200–212.
- POPL-2000-AspertiCM #recursion
- (Optimal) Duplication is not Elementary Recursive (AA, PC, SM), pp. 96–107.
- SAC-2000-FeekinC #detection #sorting #using
- Duplicate Detection Using K-way Sorting Method (AF, ZC), pp. 323–327.
- CL-2000-KhizderTW #logic #reasoning
- Reasoning about Duplicate Elimination with Description Logic (VLK, DT, GEW), pp. 1017–1032.
- ICDAR-1999-LeeH #detection #documentation
- Duplicate Detection in Symbolically Compressed Documents (DSL, JJH), pp. 305–308.
- ICDAR-1999-Lopresti #algorithm #detection #documentation #modelling
- Models and Algorithms for Duplicate Document Detection (DPL), pp. 297–300.
- ICSM-1999-DucasseRD #approach #detection #independence
- A Language Independent Approach for Detecting Duplicated Code (SD, MR, SD), pp. 109–118.
- ICDAR-1997-DoermannLK #database #detection #documentation #image
- The Detection of Duplicates in Document Image Databases (DSD, HL, OEK), pp. 314–318.
- SIGMOD-1995-GriffinL #incremental #maintenance
- Incremental Maintenance of Views with Duplicates (TG, LL), pp. 328–339.
- VLDB-1995-YanG #information management
- Duplicate Removal in Information System Dissemination (TWY, HGM), pp. 66–77.
- WCRE-1995-Baker #on the #scalability
- On Finding Duplication and Near-Duplication in Large Software Systems (BSB), pp. 86–95.
- LICS-1995-Lynch
- Paramodulation without Duplication (CL), pp. 167–177.
- CIKM-1994-ArefS #database #process #proximity
- Hashing by Proximity to Process Duplicates in Spatial Databases (WGA, HS), pp. 347–354.
- FSE-1994-CeceFI #communication #fault
- Duplication, Insertion and Lossiness Errors in Unreliable Communication Channels (GC, AF, SPI), pp. 35–43.
- VLDB-1990-MumickPR
- The Magic of Duplicates and Aggregates (ISM, HP, RR), pp. 264–277.
- NACLP-1990-Spencer #proving
- Avoiding Duplicate Proofs (BS), pp. 569–584.
- VLDB-1989-SaakeLPW #information management #prototype #sorting
- Sorting, Grouping and Duplicate Elimination in the Advanced Information Management Prototype (GS, VL, PP, LMW), pp. 307–316.
- PODS-1982-DayalGK #algebra #relational
- An Extended Relational Algebra with Control over Duplicate Elimination (UD, NG, RHK), pp. 117–123.
- SOSP-1977-Ellis #consistency #correctness #database
- Consistency and Correctness of Duplicate Database Systems (CAE), pp. 67–84.