Note
mimansajaiswal: how are duplicates identified?
GitHub Copilot: Based on the code, duplicates seem to be identified through multiple methods and components. Here's the analysis:
- Primary identification is done by Zotero's built-in duplicate detection system through the duplicates pane.
- In bulkDuplicates.ts, duplicates are fetched using a fetchDuplicates() helper function that returns:
{
duplicatesObj, // Contains methods like getSetItemsByItemID()
duplicateItems // Array of duplicate item IDs
}
- The DuplicateItems class in duplicateItems.ts is used to manage sets of duplicate items:
export class DuplicateItems {
// ...existing code...
constructor(items: Zotero.Item[] | number[], masterItemPref: MasterItem) {
// Takes an array of items that are duplicates of each other
this._items = items.map((item) => (typeof item === "number" ? Zotero.Items.get(item) : item));
// ...existing code...
}
}
- In utils.ts, there are helper functions to clean and normalize item fields for comparison:
export function cleanDOI(item: Zotero.Item): string[] {
const possibleDOIFields: Zotero.Item.ItemField[] = ["DOI", "url"];
const doiStrs = new Set<string>();
// Clean and normalize DOI fields
// ...existing code...
}
export function cleanISBN(item: Zotero.Item): string[] {
const possibleISBNFields: Zotero.Item.ItemField[] = ["DOI", "ISBN", "url"];
// Clean and normalize ISBN fields
// ...existing code...
}
export function normalizeString(input: string, wildcard = "%") {
// Normalize strings for comparison by:
// - Converting to uppercase
// - Replacing non-letter chars with wildcards
return ("" + input)
.replace(/[^a-zA-Z]+/g, wildcard)
.trim()
.toUpperCase();
}
The core duplicate detection appears to be handled by Zotero's internal duplicate detection system, while this plugin provides:
- Additional normalization and cleaning of fields like DOI and ISBN
- Utilities to manage and process sets of duplicate items
- UI and workflow for handling duplicates
- Ability to mark false positives as "non-duplicates"
The exact matching criteria used by Zotero's duplicate detection aren't visible in the provided code, but this plugin builds on top of that detection with additional field normalization and management features.
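To make the normalization step concrete, here is a minimal sketch that reuses the normalizeString helper quoted above on plain strings. The sameNormalized function is a hypothetical helper added for illustration, not something taken from the plugin, and how the plugin actually consumes the normalized value is not shown in the provided code.

```typescript
// Copy of the normalizeString helper quoted above, repeated so the sketch is self-contained.
function normalizeString(input: string, wildcard = "%"): string {
  return ("" + input)
    .replace(/[^a-zA-Z]+/g, wildcard) // every run of non-letters becomes one wildcard
    .trim()
    .toUpperCase();
}

// Hypothetical helper (assumption, not plugin code): treat two strings as
// equivalent when their normalized forms are identical.
function sameNormalized(a: string, b: string): boolean {
  return normalizeString(a) === normalizeString(b);
}

console.log(normalizeString("Deep Learning: A Review")); // DEEP%LEARNING%A%REVIEW
console.log(sameNormalized("Deep Learning: A Review", "deep learning - a review")); // true
// Digits count as non-letters, so these two also normalize to the same string.
console.log(sameNormalized("Report 2019", "Report 2020")); // true
```

The last line shows why this kind of normalization is probably best treated as a coarse pre-filter: strings that differ only in digits or punctuation collapse to the same form.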
Note
mimansajaiswal: when and how are the items merged? are they ever not merged?
GitHub Copilot: Looking through the code, merging happens in a few different scenarios and is handled by a merge function called from multiple places. Let me break it down:
The key merge scenarios are:
- When duplicates are detected during item import:
// ...existing code...
// User can choose:
//   "Keep New": merge with the new item as master
//   "Keep Old": merge with the existing item as master
//   "Keep All": no merging, keep both items
//   "Merge Manually": no automatic merging, the user handles it manually
// ...existing code...
- During bulk merge operations:
private async bulkMergeDuplicates() {
// ...existing code...
for (let i = 0; i < duplicateItems.length; i++) {
// ...existing code...
const duItems = new DuplicateItems(items, masterItemPref);
// ...existing code...
const masterItem = duItems.masterItem;
const otherItems = duItems.otherItems;
await merge(masterItem, otherItems);
// ...existing code...
}
// ...existing code...
}
Items are NOT merged in these cases:
- When user chooses "Keep All" action
- When user chooses "Merge Manually" to handle it themselves
- When items are marked as non-duplicates:
// When items are marked as non-duplicates they are excluded from auto-merging
export async function toggleNonDuplicates(action: "mark" | "unmark", items?: number[] | Zotero.Item[]) {
// ...existing code...
if (action === "mark") {
await NonDuplicatesDB.instance.insertNonDuplicates(itemIDs, Zotero.getActiveZoteroPane().getSelectedLibraryID());
} else if (action === "unmark") {
await NonDuplicatesDB.instance.deleteNonDuplicates(itemIDs);
}
// ...existing code...
}
- When the bulk merge operation is suspended or canceled:
if (!this._isRunning) {
const result = Zotero.Prompt.confirm({
// ...existing code...
});
if (result == 0) {
// Resume merging
restoreCheckbox.value = false;
this.isRunning = true;
} else {
// Cancel merging
toCancel = true;
break;
}
}
The actual merging is delegated to Zotero's built-in merge functionality, which carries over child data such as notes, tags, and attachments. Before that call, the plugin appears to fill in fields the master item lacks using values from the other duplicates, so the surviving item keeps the most complete information.
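The import-time choices listed above can be summarized as a small decision function. This is only a sketch on plain IDs: ImportChoice, MergePlan, and planMerge are hypothetical names, and picking the first existing item as master for "Keep Old" is an assumption (the plugin uses its master-item preference for that).

```typescript
// Hypothetical names throughout; the plugin's own types are not shown in full.
type ImportChoice = "keepNew" | "keepOld" | "keepAll" | "mergeManually";

interface MergePlan {
  masterID: number;
  otherIDs: number[];
}

// Returns a merge plan, or null when the choice means "do not merge automatically".
function planMerge(choice: ImportChoice, newID: number, existingIDs: number[]): MergePlan | null {
  if (existingIDs.length === 0) return null; // nothing to merge with
  switch (choice) {
    case "keepNew":
      // The newly imported item survives; all existing duplicates fold into it.
      return { masterID: newID, otherIDs: existingIDs.slice() };
    case "keepOld": {
      // Assumption: the first existing item acts as master.
      const [masterID, ...rest] = existingIDs;
      return { masterID, otherIDs: rest.concat(newID) };
    }
    case "keepAll":
    case "mergeManually":
      // Both choices leave every item in place.
      return null;
  }
}

console.log(planMerge("keepOld", 9, [3, 5])); // { masterID: 3, otherIDs: [ 5, 9 ] }
console.log(planMerge("keepAll", 9, [3, 5])); // null
```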
Note
mimansajaiswal: how is the actual merge performed?
GitHub Copilot: Looking through the code, the actual merging is performed in the merger.ts file (which isn't shown in the provided files). However, I can piece together how merging works from the usage patterns:
- The merge is initiated from a few different places:
private async processDuplicates(duplicateMaps: Map<number, { existingItemIDs: number[]; action: Action }>) {
// ...existing code...
for (const [newItemID, { existingItemIDs, action }] of duplicateMaps) {
if (action === Action.KEEP) {
items.push({
masterItem: newItem,
otherItems: existingItemIDs.map((id) => Zotero.Items.get(id)),
});
} else if (action === Action.DISCARD) {
// ...existing code...
items.push({
masterItem: masterItem,
otherItems: [...otherItems, newItem],
});
}
}
// ...existing code...
for (const { masterItem, otherItems } of items) {
await merge(masterItem, otherItems); // This performs the actual merge
}
// ...existing code...
}
- The choice of which item becomes the master item is determined by the DuplicateItems class:
private analyze() {
let compare: (a: Zotero.Item, b: Zotero.Item) => number;
switch (this._masterItemPref) {
case MasterItem.OLDEST:
compare = (a, b) => (b.dateAdded < a.dateAdded ? 1 : -1);
break;
case MasterItem.NEWEST:
compare = (a, b) => (b.dateAdded > a.dateAdded ? 1 : -1);
break;
case MasterItem.MODIFIED:
compare = (a, b) => (b.dateModified > a.dateModified ? 1 : -1);
break;
case MasterItem.DETAILED:
compare = (a, b) => {
const fieldDiff = b.getUsedFields(false).length - a.getUsedFields(false).length;
if (fieldDiff !== 0) return fieldDiff;
return b.dateAdded < a.dateAdded ? 1 : -1;
};
break;
}
this._items.sort(compare);
this._masterItem = this._items[0];
}
- The merge happens in a few different scenarios:
- When manually handling individual duplicates
- During bulk merge operations
- When auto-merging based on user preferences
Based on comments in the code, the merging process:
- Takes a master item and a list of other duplicate items
- Preserves all metadata like notes, tags, attachments
- Extracts useful information from duplicate items into the master item
- Uses Zotero's built-in merge functionality with some improvements
- Deletes the duplicate items after merging their data into the master
The actual merge implementation would be in a separate file, but this is the high-level flow of how items get merged. The most significant aspect is that it's not just deleting duplicates - it intelligently combines metadata from all duplicates into a single master record.
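As an illustration of the master-item selection described above, here is a sketch on plain objects. ItemLike, MasterPref, compareByPref, and pickMaster are hypothetical names; usedFieldCount stands in for item.getUsedFields(false).length. Unlike the quoted comparators, this version returns 0 for ties so the ordering stays consistent.

```typescript
// Plain stand-in for the few Zotero.Item properties the comparators read.
interface ItemLike {
  id: number;
  dateAdded: string; // SQL-style timestamps compare correctly as strings
  dateModified: string;
  usedFieldCount: number; // stands in for item.getUsedFields(false).length
}

type MasterPref = "oldest" | "newest" | "modified" | "detailed";

const byString = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Negative result means `a` is the better master candidate than `b`.
function compareByPref(pref: MasterPref, a: ItemLike, b: ItemLike): number {
  switch (pref) {
    case "oldest":
      return byString(a.dateAdded, b.dateAdded);
    case "newest":
      return byString(b.dateAdded, a.dateAdded);
    case "modified":
      return byString(b.dateModified, a.dateModified);
    case "detailed":
      // More used fields wins; ties fall back to the older item.
      return b.usedFieldCount - a.usedFieldCount || byString(a.dateAdded, b.dateAdded);
  }
}

function pickMaster(items: ItemLike[], pref: MasterPref): ItemLike {
  if (items.length === 0) throw new Error("no items to choose from");
  return items.reduce((best, item) => (compareByPref(pref, item, best) < 0 ? item : best));
}

const sample: ItemLike[] = [
  { id: 1, dateAdded: "2020-01-01 00:00:00", dateModified: "2021-01-01 00:00:00", usedFieldCount: 5 },
  { id: 2, dateAdded: "2022-01-01 00:00:00", dateModified: "2022-06-01 00:00:00", usedFieldCount: 9 },
  { id: 3, dateAdded: "2021-01-01 00:00:00", dateModified: "2023-01-01 00:00:00", usedFieldCount: 9 },
];

console.log(pickMaster(sample, "oldest").id); // 1
console.log(pickMaster(sample, "detailed").id); // 3 (ties on field count, older than item 2)
```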
Let me know if you'd like me to explain any specific part in more detail!
Note
mimansajaiswal: here you go how is the merge performed now?
GitHub Copilot: Now I can see the actual merge implementation in the code. Here's how merging works:
export async function merge(
masterItem: Zotero.Item,
otherItems: Zotero.Item[], // Already sorted
): Promise<any> {
Zotero.CollectionTreeCache.clear();
// Only merge items of the same type
const masterItemType = masterItem.itemTypeID;
otherItems = otherItems.filter((item) => item.itemTypeID === masterItemType);
if (otherItems.length === 0) {
return;
}
// Combine JSON data from all items
const masterJSON = masterItem.toJSON();
const candidateJSON: {
[field in Zotero.Item.DataType]?: string | unknown;
} = otherItems.reduce((acc, obj) => ({ ...acc, ...obj.toJSON() }), {});
// Exclude certain properties that are empty in the cloned object
// so we don't clobber them
const { relations, collections, tags, ...keep } = candidateJSON;
// Merge the data, giving priority to master item's data
masterItem.fromJSON({ ...keep, ...masterJSON });
// Use Zotero's built-in merge to handle attachments, notes, etc
return await Zotero.Items.merge(masterItem, otherItems);
}
The merge process works in several steps:
- Clears the collection tree cache first
- Filters out any items that aren't the same type as the master item
- Combines metadata by:
  - Converting all items to JSON format
  - Merging the JSON data from the other items into one combined object (later items overwrite earlier ones)
  - Dropping relations, collections, and tags from that combined object so they don't clobber the master's values
  - Applying the merged data back to the master item, with the master's data taking precedence
- Finally calls Zotero's built-in merge function to handle:
  - Attachments
  - Notes
  - Tags
  - Collections
  - Relations
  - Other Zotero-specific item data
This approach ensures that:
- Fields missing from the master item are filled in from the other items
- Master item's data takes priority wherever the master already has a value
- Zotero's native merge handling is used for child items, tags, collections, and relations
- Only items of the same type are merged
- The collection tree cache is cleared before the merge runs
The merge is called from various places in the code when duplicates need to be handled, but the actual merging logic is centralized in this one function.
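The precedence rules in that function come entirely from object spread order, which is easy to check on plain objects. The sketch below mirrors the spread logic only; ItemJSON and combineFields are hypothetical names, and the sample field values are made up for illustration.

```typescript
type ItemJSON = Record<string, unknown>;

// Mirrors the spread logic in the quoted merge(): later "other" items override
// earlier ones, relations/collections/tags are dropped, and the master wins last.
function combineFields(masterJSON: ItemJSON, othersJSON: ItemJSON[]): ItemJSON {
  const candidate = othersJSON.reduce<ItemJSON>((acc, json) => ({ ...acc, ...json }), {});
  const { relations: _r, collections: _c, tags: _t, ...keep } = candidate;
  return { ...keep, ...masterJSON };
}

const combined = combineFields(
  { title: "A", DOI: "" },
  [
    { title: "B", DOI: "10.1/x", pages: "1-2", tags: ["t"] },
    { pages: "3-4", abstractNote: "abs" },
  ],
);

console.log(combined);
// title stays "A" (master wins), pages is "3-4" (last other item wins),
// abstractNote is filled in, tags is dropped, and DOI stays "" because the
// key is present on the master object.
```

One consequence worth noting: a key that exists on the master with an empty value still blocks the fill. Whether Zotero's toJSON() emits empty fields is not shown in the provided code, so treat that case as something to verify rather than a known behavior.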
Note
mimansajaiswal: here is some code from zotero -- how are they internally doing it?
/* ***** BEGIN LICENSE BLOCK *****
Copyright © 2024 Corporation for Digital Scholarship Vienna, Virginia, USA https://www.zotero.org
This file is part of Zotero.
Zotero is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Zotero is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with Zotero. If not, see http://www.gnu.org/licenses/.
***** END LICENSE BLOCK ***** */
{
class DuplicatesMergePane extends XULElementBase {
content = MozXULElement.parseXULToFragment(`
<groupbox id="zotero-duplicates-merge-version-select">
<description>&zotero.duplicatesMerge.versionSelect;</description>
<hbox>
<richlistbox id="zotero-duplicates-merge-original-date" rows="0"/>
</hbox>
</groupbox>
<groupbox id="zotero-duplicates-merge-field-select">
<description>&zotero.duplicatesMerge.fieldSelect;</description>
</groupbox>
<vbox id="zotero-duplicates-merge-info-box-container">
<info-box id="zotero-duplicates-merge-info-box"/>
</vbox>
`, ['chrome://zotero/locale/zotero.dtd']);
init() {
this._masterItem = null;
this._items = [];
this._otherItems = [];
this._ignoreFields = ['dateAdded', 'dateModified', 'accessDate'];
this.querySelector("#zotero-duplicates-merge-button").addEventListener(
"command", () => this.merge());
this.querySelector("#zotero-duplicates-merge-original-date").addEventListener(
"select", event => this.setMaster(event.target.selectedIndex));
}
setItems(items, displayNumItemsOnTypeError) {
let itemTypeID, oldestItem, otherItems = [];
for (let item of items) {
// Find the oldest item
if (!oldestItem) {
oldestItem = item;
}
else if (item.dateAdded < oldestItem.dateAdded) {
otherItems.push(oldestItem);
oldestItem = item;
}
else {
otherItems.push(item);
}
if (!item.isRegularItem() || ['annotation', 'attachment', 'note'].includes(item.itemType)) {
let msg = Zotero.getString('pane.item.duplicates.onlyTopLevel');
ZoteroPane.itemPane.setItemPaneMessage(msg);
return false;
}
// Make sure all items are of the same type
if (itemTypeID) {
if (itemTypeID != item.itemTypeID) {
let msg;
if (displayNumItemsOnTypeError) {
msg = { l10nId: 'item-pane-message-items-selected', l10nArgs: { count: items.length } };
}
else {
msg = Zotero.getString('pane.item.duplicates.onlySameItemType');
}
ZoteroPane.itemPane.setItemPaneMessage(msg);
return false;
}
}
else {
itemTypeID = item.itemTypeID;
}
}
this._items = items;
this._items.sort(function (a, b) {
return a.dateAdded > b.dateAdded ? 1 : a.dateAdded == b.dateAdded ? 0 : -1;
});
//
// Update the UI
//
let button = document.getElementById('zotero-duplicates-merge-button');
let versionSelect = document.getElementById('zotero-duplicates-merge-version-select');
let itembox = document.getElementById('zotero-duplicates-merge-info-box');
let fieldSelect = document.getElementById('zotero-duplicates-merge-field-select');
let alternatives = oldestItem.multiDiff(otherItems, this._ignoreFields);
if (alternatives) {
// Populate menulist with Date Added values from all items
let dateList = document.getElementById('zotero-duplicates-merge-original-date');
dateList.innerHTML = '';
let numRows = 0;
for (let item of items) {
let date = Zotero.Date.sqlToDate(item.dateAdded, true);
dateList.appendItem(date.toLocaleString());
numRows++;
}
dateList.setAttribute('rows', numRows);
// If we set this inline, the selection doesn't take on the first
// selection after unhiding versionSelect (when clicking
// from a set with no differences) -- tested in Fx5.0.1
setTimeout(function () {
dateList.selectedIndex = 0;
}, 0);
}
document.l10n.setArgs(button, { count: otherItems.length + 1 });
versionSelect.hidden = fieldSelect.hidden = !alternatives;
itembox.hiddenFields = alternatives ? [] : ['dateAdded', 'dateModified'];
// Since the header of the collapsible section is hidden, the section has to be opened
itembox.open = true;
this.setMaster(0);
return true;
}
setMaster(pos) {
let itembox = document.getElementById('zotero-duplicates-merge-info-box');
itembox.mode = 'fieldmerge';
this._otherItems = this._items.concat();
let item = this._otherItems.splice(pos, 1)[0];
// Add master item's values to the beginning of each set of
// alternative values so that they're still available if the item box
// modifies the item
let alternatives = item.multiDiff(this._otherItems, this._ignoreFields);
if (alternatives) {
let itemValues = item.toJSON();
for (let i in alternatives) {
alternatives[i].unshift(itemValues[i] !== undefined ? itemValues[i] : '');
}
itembox.fieldAlternatives = alternatives;
}
this._masterItem = item;
itembox.item = item.clone();
// The item.id is null which equals to _lastRenderItemID, so we need to force render it
itembox._forceRenderAll();
}
async merge() {
let itembox = document.getElementById('zotero-duplicates-merge-info-box');
Zotero.CollectionTreeCache.clear();
// Update master item with any field alternatives from the item box
let json = this._masterItem.toJSON();
// Exclude certain properties that are empty in the cloned object, so we don't clobber them
const { relations: _r, collections: _c, tags: _t, ...keep } = itembox.item.toJSON();
Object.assign(json, keep);
this._masterItem.fromJSON(json);
Zotero.Items.merge(this._masterItem, this._otherItems);
}
}
customElements.define("duplicates-merge-pane", DuplicatesMergePane);
}
GitHub Copilot: Looking at Zotero's internal code, there are some key differences in how they handle merging compared to the plugin's implementation:
- Field-level merging:
// Zotero's approach allows per-field merging through UI
setMaster(pos) {
// ...existing code...
// Get differences between master and other items per field
let alternatives = item.multiDiff(this._otherItems, this._ignoreFields);
if (alternatives) {
let itemValues = item.toJSON();
for (let i in alternatives) {
// Keep master item's values available as first option
alternatives[i].unshift(itemValues[i] !== undefined ? itemValues[i] : '');
}
itembox.fieldAlternatives = alternatives;
}
// ...existing code...
}
- Item validation:
// Zotero does strict validation before allowing merge
setItems(items, displayNumItemsOnTypeError) {
// ...existing code...
if (!item.isRegularItem() || ['annotation', 'attachment', 'note'].includes(item.itemType)) {
let msg = Zotero.getString('pane.item.duplicates.onlyTopLevel');
ZoteroPane.itemPane.setItemPaneMessage(msg);
return false;
}
// Make sure all items are of the same type
if (itemTypeID && itemTypeID != item.itemTypeID) {
// ...show error...
return false;
}
// ...existing code...
}
We could improve the plugin's merge function:
export async function merge(
masterItem: Zotero.Item,
otherItems: Zotero.Item[],
): Promise<any> {
Zotero.CollectionTreeCache.clear();
// Validate items
const masterItemType = masterItem.itemTypeID;
otherItems = otherItems.filter(item =>
item.itemTypeID === masterItemType &&
item.isRegularItem() &&
!['annotation', 'attachment', 'note'].includes(item.itemType)
);
if (otherItems.length === 0) return;
// Snapshot the master's fields once; needed both for the alternatives and for the final merge
const masterJSON = masterItem.toJSON();
// Get field differences first
const alternatives = masterItem.multiDiff(otherItems, ['dateAdded', 'dateModified', 'accessDate']);
if (alternatives) {
// Add master values as first option for each field
for (let field in alternatives) {
alternatives[field].unshift(masterJSON[field] !== undefined ? masterJSON[field] : '');
}
// Note: We could potentially add UI here to let users choose per field
}
// Merge the data
const candidateJSON = otherItems.reduce((acc, obj) => ({ ...acc, ...obj.toJSON() }), {});
const { relations, collections, tags, ...keep } = candidateJSON;
masterItem.fromJSON({ ...keep, ...masterJSON });
return await Zotero.Items.merge(masterItem, otherItems);
}
The key differences are:
- Zotero allows field-by-field merging through UI
- Zotero does stricter validation of mergeable items
- Zotero keeps master item's field values available as options
- Zotero shows differences between items before merging
- The plugin currently does a more automated merge without field-level control
The plugin could potentially be enhanced by:
- Adding field-level merge controls in the UI
- Implementing stricter item validation
- Showing field differences before merging
- Preserving more granular control over which values to keep
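To show what "field differences" could look like as data, here is a rough sketch on plain string maps. It is not Zotero's multiDiff and may differ from it in details; collectAlternatives and FieldMap are hypothetical names. The only behavior borrowed from the quoted code is that the master's own value sits first in each list and that a falsy result means "no differences".

```typescript
type FieldMap = Record<string, string>;

// Hypothetical stand-in for the idea behind multiDiff: for each field, list the
// master's value first, followed by distinct non-empty values from other items
// that differ from it. Returns null when nothing differs.
function collectAlternatives(
  master: FieldMap,
  others: FieldMap[],
  ignore: string[] = ["dateAdded", "dateModified", "accessDate"],
): Record<string, string[]> | null {
  const alternatives: Record<string, string[]> = {};
  for (const other of others) {
    for (const field of Object.keys(other)) {
      if (ignore.indexOf(field) !== -1) continue;
      const value = other[field] ?? "";
      const masterValue = master[field] ?? "";
      if (value === "" || value === masterValue) continue;
      let list = alternatives[field];
      if (!list) {
        // Seed with the master's value so it stays available as the first option.
        list = alternatives[field] = [masterValue];
      }
      if (list.indexOf(value) === -1) list.push(value);
    }
  }
  return Object.keys(alternatives).length > 0 ? alternatives : null;
}

const diff = collectAlternatives(
  { title: "A", DOI: "", dateAdded: "2020-01-01" },
  [
    { title: "A", DOI: "10.1/x", dateAdded: "2021-01-01" },
    { title: "A!", DOI: "10.1/x" },
  ],
);
console.log(diff); // { DOI: [ '', '10.1/x' ], title: [ 'A', 'A!' ] }
```

A UI could render each list as a set of choices per field, with index 0 preselected.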
Let me know if you would like me to elaborate on implementing any of these improvements!
Note
mimansajaiswal: how are items of different item types handled in the plugin?
GitHub Copilot: Looking at the code, items of different types are handled in a few places:
- During merge, items of different types are filtered out:
export async function merge(
masterItem: Zotero.Item,
otherItems: Zotero.Item[],
): Promise<any> {
// Only merge items of the same type
const masterItemType = masterItem.itemTypeID;
otherItems = otherItems.filter((item) => item.itemTypeID === masterItemType);
if (otherItems.length === 0) {
return;
}
// ...existing code...
}
This is less strict than Zotero's internal handling, which does validation upfront:
// From Zotero's internal code
setItems(items, displayNumItemsOnTypeError) {
// ...existing code...
// Make sure all items are of the same type
if (itemTypeID) {
if (itemTypeID != item.itemTypeID) {
let msg;
if (displayNumItemsOnTypeError) {
msg = { l10nId: 'item-pane-message-items-selected', l10nArgs: { count: items.length } };
}
else {
msg = Zotero.getString('pane.item.duplicates.onlySameItemType');
}
ZoteroPane.itemPane.setItemPaneMessage(msg);
return false;
}
}
// ...existing code...
}
The main differences are:
- The plugin silently filters out items of different types during merge
- Zotero shows an error message and prevents the merge entirely
- The plugin doesn't validate item types during selection, only at merge time
To align better with Zotero's handling, the plugin could be improved by:
- Moving type validation earlier in the process
- Adding error messages for mismatched types
- Preventing merge attempts between different types
However, the current implementation ensures items can only be merged with others of the same type, just with less strict validation and user feedback.
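Moving the type check ahead of the merge could look roughly like the sketch below, which reports mismatches instead of silently filtering them. TypedItem, TypeCheck, and checkSameType are hypothetical names, and the sketch works on plain objects rather than Zotero.Item.

```typescript
interface TypedItem {
  id: number;
  itemTypeID: number;
}

type TypeCheck =
  | { ok: true; itemTypeID: number }
  | { ok: false; mismatchedIDs: number[] };

// Hypothetical pre-merge check: collect the IDs whose type differs from the
// master's so the caller can warn the user or skip the set.
function checkSameType(master: TypedItem, others: TypedItem[]): TypeCheck {
  const mismatchedIDs = others
    .filter((item) => item.itemTypeID !== master.itemTypeID)
    .map((item) => item.id);
  return mismatchedIDs.length === 0
    ? { ok: true, itemTypeID: master.itemTypeID }
    : { ok: false, mismatchedIDs };
}

const check = checkSameType(
  { id: 1, itemTypeID: 2 },
  [
    { id: 2, itemTypeID: 2 },
    { id: 3, itemTypeID: 7 },
  ],
);
if (!check.ok) {
  console.log("Type mismatch for items: " + check.mismatchedIDs.join(", ")); // Type mismatch for items: 3
}
```

A caller could run this before merge() and show a message on failure, which is closer to what Zotero's setItems() does.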
Let me know if you would like me to show how to implement any of these improvements!
Note
mimansajaiswal: what is being called here? return await Zotero.Items.merge(masterItem, otherItems); what is this merge function now?
GitHub Copilot: Looking at the Zotero types definition file provided, Zotero.Items.merge is a core Zotero function defined in the Items interface:
interface Items extends DataObjects {
// ...existing code...
merge(item: Zotero.Item, otherItems: Zotero.Item[]): Promise<any>;
// ...existing code...
}
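This is Zotero's internal merge function. Both callers shown so far prepare the master item before invoking it:
- In the source code you shared, Zotero's merge pane first writes the field values from the item box onto the master item: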
// From Zotero's DuplicatesMergePane class
async merge() {
let itembox = document.getElementById('zotero-duplicates-merge-info-box');
Zotero.CollectionTreeCache.clear();
// Update master item with any field alternatives from the item box
let json = this._masterItem.toJSON();
// Exclude certain properties that are empty in the cloned object
const { relations: _r, collections: _c, tags: _t, ...keep } = itembox.item.toJSON();
Object.assign(json, keep);
this._masterItem.fromJSON(json);
// Finally call the core merge function
Zotero.Items.merge(this._masterItem, this._otherItems);
}
- The plugin's merger.ts follows a similar pattern before calling Zotero's merge:
export async function merge(
masterItem: Zotero.Item,
otherItems: Zotero.Item[],
): Promise<any> {
// ...existing code...
const masterJSON = masterItem.toJSON();
const candidateJSON = otherItems.reduce((acc, obj) => ({ ...acc, ...obj.toJSON() }), {});
const { relations, collections, tags, ...keep } = candidateJSON;
masterItem.fromJSON({ ...keep, ...masterJSON });
// Call Zotero's core merge function
return await Zotero.Items.merge(masterItem, otherItems);
}
So the plugin is leveraging Zotero's built-in merge functionality to handle the core merging operations, but adds its own pre-processing layer to:
- Clear the collection tree cache
- Filter items by type
- Combine JSON data from all items
- Apply merged data to the master item
- Then let Zotero handle the final merge including attachments, notes, tags etc.
This ensures compatibility with Zotero's internal data structures while adding additional merge preparation logic.
Note
mimansajaiswal: where is the preferences pane of the zoplicate extension defined?
GitHub Copilot: Based on the code, the preferences pane is defined in a few places:
- First, it's registered in preferenceScript.ts:
export function registerPrefs() {
Zotero.PreferencePanes.register({
pluginID: config.addonID,
src: rootURI + "chrome/content/preferences.xhtml", // Points to the UI definition
label: getString("prefs-title"),
image: `chrome://${config.addonRef}/content/icons/preficon.svg`,
stylesheets:[`chrome://${config.addonRef}/content/prefs.css`],
helpURL: homepage,
});
}
- The default preferences are set in prefs.js:
pref("__prefsPrefix__.duplicate.default.action", "ask");
pref("__prefsPrefix__.bulk.master.item", "oldest");
pref("__prefsPrefix__.duplicate.stats.enable", true);
- The preferences are accessed via enums defined in prefs.ts:
export enum Action {
KEEP = "keep",
DISCARD = "discard",
CANCEL = "cancel",
ASK = "ask",
}
export enum MasterItem {
OLDEST = "oldest",
NEWEST = "newest",
// ...existing code...
}
- The actual UI definition should be in preferences.xhtml in the chrome/content directory (not shown in the provided files), which would define:
- The preferences pane layout
- Radio buttons for duplicate actions (keep new/old/all/ask)
- Radio buttons for master item selection (oldest/newest/modified/detailed)
- Checkbox for showing duplicate statistics
- Any other preferences UI elements
The preference system uses:
- config.prefsPrefix from package.json for the preferences namespace
- getPref() and setPref() utility functions to read/write preferences
- String localizations from the locale files for labels
- CSS styling from prefs.css
So while we don't see the actual UI definition file, the preferences system is set up through these various components working together.
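Since the stored preferences are plain strings, code that reads them usually has to map the raw value back onto the enum. A minimal sketch of that step follows. parseMasterItem is a hypothetical helper, and the MODIFIED and DETAILED string values are assumptions inferred from the comparator code and the options mentioned earlier, since the quoted enum is elided after NEWEST.

```typescript
// String enum modeled on the quoted prefs.ts excerpt. The last two members'
// string values are assumptions; the excerpt elides them.
enum MasterItem {
  OLDEST = "oldest",
  NEWEST = "newest",
  MODIFIED = "modified",
  DETAILED = "detailed",
}

const MASTER_ITEM_VALUES: MasterItem[] = [
  MasterItem.OLDEST,
  MasterItem.NEWEST,
  MasterItem.MODIFIED,
  MasterItem.DETAILED,
];

// Hypothetical helper: coerce a raw stored preference into a known enum member,
// falling back to a default when the value is missing or unrecognized.
function parseMasterItem(raw: unknown, fallback: MasterItem = MasterItem.OLDEST): MasterItem {
  for (const value of MASTER_ITEM_VALUES) {
    if (value === raw) return value;
  }
  return fallback;
}

console.log(parseMasterItem("newest")); // newest
console.log(parseMasterItem("bogus")); // oldest
```

The default of "oldest" matches the bulk.master.item default shown in prefs.js.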
Let me know if you would like me to explain any specific part in more detail!
Note
mimansajaiswal: this is an older plugin, how does this handle type mismatch? the problem is that it no longer works with newer zotero:
// Startup -- load Zotero and constants
if (typeof Zotero === 'undefined') {
    Zotero = {};
}
Zotero.DuplicatesMerger = {};
// Preference managers
function getPref(pref) { return Zotero.Prefs.get('extensions.duplicatesmerger.' + pref, true); }
function setPref(pref, value) { return Zotero.Prefs.set('extensions.duplicatesmerger.' + pref, value, true); }
// Startup - initialize plugin
Zotero.DuplicatesMerger.loadURI = function(uri){
    ZoteroPane_Local.loadURI(uri);
};
Zotero.DuplicatesMerger.init = function() {
this._ignoreFields = ['dateAdded', 'dateModified', 'accessDate'];
this.noMismatchedItemsSkipped = 0;
this.noSkippedItems = 0;
this.lastProcessedItemId = 0;
this.currentRowCount = 0;
this.isRunning = false;
this.elapsedTimeSinceLastAction = 0;
this.initialNoItems = 1;
this.selectedItemsList = [];
this.selectedItemsIds = [];
this.mismatchedIds = [];
this.current_state = "idle";
this.stringsBundle = Components.classes['@mozilla.org/intl/stringbundle;1']
.getService(Components.interfaces.nsIStringBundleService)
.createBundle('chrome://zoteroduplicatesmerger/locale/duplicatesmerger.properties');
// Switch to duplicates pane if selected
//this.RestartDuplicatesMerge();
};
getCurrentTime = function(){ return Zotero.Date.dateToSQL(new Date(Date.now()), true); };
Zotero.DuplicatesMerger.getFormattedString = function(stringName, params) {
try {
if (params !== undefined){
if (typeof params != 'object'){
params = [params];
}
return this.stringsBundle.formatStringFromName(stringName, params, params.length);
}
else {
return this.stringsBundle.GetStringFromName(stringName);
}
}
catch (e){
throw ('Localized string not available for ' + stringName);
}
};
// Controls for Tools menu
// *********** Set the checkbox checks, from pref
Zotero.DuplicatesMerger.setCheck = function(type) {
    var pref = getPref(type);
    if (type == 'master'){
        var tools_oldest = document.getElementById("menu_Tools-duplicatesmerger-menu-popup-master-oldest");
        var tools_newest = document.getElementById("menu_Tools-duplicatesmerger-menu-popup-master-newest");
        var tools_creator = document.getElementById("menu_Tools-duplicatesmerger-menu-popup-master-creator");
        tools_oldest.setAttribute("checked", Boolean(pref === "oldest"));
        tools_newest.setAttribute("checked", Boolean(pref === "newest"));
        tools_creator.setAttribute("checked", Boolean(pref === "creator"));
    }
    else if (type == 'typemismatch'){
        var tools_skip = document.getElementById("menu_Tools-duplicatesmerger-menu-popup-typemismatch-skip");
        var tools_master = document.getElementById("menu_Tools-duplicatesmerger-menu-popup-typemismatch-master");
        tools_skip.setAttribute("checked", Boolean(pref === "skip"));
        tools_master.setAttribute("checked", Boolean(pref === "master"));
    }
};
// *********** Change the checkbox, to pref
Zotero.DuplicatesMerger.changePref = function changePref(pref, option) {
    setPref(pref, option);
};
/**
 * Open preference window
 */
Zotero.DuplicatesMerger.openPreferenceWindow = function(paneID, action) {
    var io = {pane: paneID, action: action};
    window.openDialog('chrome://zoteroduplicatesmerger/content/options.xul',
        'duplicatesmerger-pref',
        'chrome,titlebar,toolbar,centerscreen' + Zotero.Prefs.get('browser.preferences.instantApply', true) ? 'dialog=no' : 'modal',
        io
    );
};
/**
 * Context menu
 */
Zotero.DuplicatesMerger.showItemsPopup = function() {
    var win = Services.wm.getMostRecentWindow('navigator:browser');
    var isDuplicatesPane = Zotero.getActiveZoteroPane().getCollectionTreeRow().isDuplicates();
    win.ZoteroPane.document.getElementById('duplicatesmerger-itemmenu-bulk').setAttribute('hidden', !isDuplicatesPane);
    win.ZoteroPane.document.getElementById('duplicatesmerger-itemmenu-single').setAttribute('hidden', false);
}.bind(Zotero.DuplicatesMerger);
/**
 * Initializes a new progress window
 */
Zotero.DuplicatesMerger.createProgressWindow = function(){
    // If there is already a window close it
    if (this.progressWindow) {
        this.progressWindow.close();
    }
    // Create a new window and initialize it
    var iconHeadline = 'chrome://zotero/skin/treesource-duplicates' + (Zotero.hiDPI ? "@2x" : "") + '.png';
    var icon = "chrome://zotero/skin/plus.png";
    this.progressWindow = new Zotero.ProgressWindow({closeOnClick:false});
    this.progressWindow.changeHeadline(this.getFormattedString("general.progressHeaderInitial"), iconHeadline);
    this.progressWindow.progress = new this.progressWindow.ItemProgress(icon);
    this.progressWindow.progress.setProgress(100);
    this.progressWindow.progress.setText(this.getFormattedString("general.progressMsgInitial", [this.initialNoItems]));
    this.progressWindow.show();
};
/**
 * Update the progress window based on the number of items processed
 */
Zotero.DuplicatesMerger.updateProgressWindow = function () {
    var processed = this.initialNoItems - this.currentRowCount + this.noMismatchedItemsSkipped;
    var percent = Math.round((processed/this.initialNoItems)*100);
    this.progressWindow.progress.setProgress(percent);
    this.progressWindow.progress.setText(this.getFormattedString("general.itemsProcessed", [processed, this.initialNoItems, this.currentRowCount - this.noMismatchedItemsSkipped]));
    this.progressWindow.show();
};
Zotero.DuplicatesMerger.closeProgressWindow = function (errorNo, header, msg) {
    var iconHeadline = 'chrome://zotero/skin/treesource-duplicates' + (Zotero.hiDPI ? "@2x" : "") + '.png';
    if (errorNo == 0) {
        this.progressWindow.changeHeadline(header, iconHeadline);
        this.progressWindow.progress = new this.progressWindow.ItemProgress("chrome://zotero/skin/cross.png");
        this.progressWindow.progress.setProgress(100);
        this.progressWindow.progress.setText(msg);
        this.progressWindow.show();
        this.progressWindow.startCloseTimer(5000);
    }
    else {
        this.progressWindow.changeHeadline(header, iconHeadline);
        this.progressWindow.addDescription(msg);
        this.progressWindow.progress = new this.progressWindow.ItemProgress("chrome://zotero/skin/tick.png");
        this.progressWindow.show();
        this.progressWindow.startCloseTimer(5000);
    }
};
/**
 * Single item merge
 */
Zotero.DuplicatesMerger.smartMerge = function() {
    this.selectedItemsList = Zotero.getActiveZoteroPane().itemsView.getSelectedItems();
    var skippreview = getPref("skippreview");
    this.mergeSelectedItems(skippreview );
};
function getCreatorName(creatorEntry){
    if (creatorEntry.name != null) return creatorEntry.name;
    return creatorEntry.lastName + " " + creatorEntry.firstName;
}
/**
 * Bulk merge
 */
Zotero.DuplicatesMerger.mergeSelectedItems = async function(performMerge){
    DupPane = Zotero.getActiveZoteroPane();
    this.current_state = "merge_items";

    await DupPane.mergeSelectedItems();
    await Zotero.Promise.delay(1);

    items = DupPane.getSelectedItems();

    // Find the master item
    var masterSelectionPreference = getPref("master");
    items.sort(function (a, b) {
        return a.dateAdded > b.dateAdded ? 1 : a.dateAdded == b.dateAdded ? 0 : -1;
    });

    if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: attempt items merge");

    var masterIndex = 0;
    if (masterSelectionPreference == "newest"){
        masterIndex = items.length - 1;
    }
    // Select as master item the one that has the longest first name author
    else if (masterSelectionPreference == "creator"){
        // Retrieve the possible alternatives for each property
        var item = items[0];
        var _otherItems = items.concat();
        var alternatives = item.multiDiff(_otherItems, this._ignoreFields);

        // If there are alternatives names for the creators
        if (alternatives.creators != null){
            // find the length of the first creator for the first entry
            var longestCreatorsNameLength = 0;
            var firstItemValues = item.toJSON();
            for (let creator of firstItemValues.creators){
                if (creator.creatorType != "author") continue;
                longestCreatorsNameLength = getCreatorName(creator).length;
                break;
            }
            // go over each item and find if there's a first creator with a longer name
            for (var i = 1 ; i < _otherItems.length ; i++){
                var alternativeItemValues = _otherItems[i].toJSON();
                if (alternativeItemValues.creators.length == 0) continue;
                for (let creator of alternativeItemValues.creators){
                    if (creator.creatorType != "author") continue;
                    var alternativeNameLength = getCreatorName(creator).length;
                    if (alternativeNameLength > longestCreatorsNameLength){
                        longestCreatorsNameLength = alternativeNameLength;
                        masterIndex = i;
                    }
                    break;
                }
            }
        }
    }

    // Select the master item
    this.current_state = "merge_items:select_master";
    if (masterIndex > 0){
        var dateList = document.getElementById('zotero-duplicates-merge-original-date');
        dateList.selectedIndex = masterIndex;

        Zotero_Duplicates_Pane.setMaster(masterIndex);
    }

    // Handle type mismatching between items
    var masterTypeId = items[masterIndex].itemTypeID;
    for (let item of items) {
        // if a type mismatch was found then handle it
        if (masterTypeId != item.itemTypeID){
            var typemismatchPreference = getPref("typemismatch");
            if (typemismatchPreference == "skip"){
                this.current_state = "idle";
                return false;
            }
            else if (typemismatchPreference == "master"){
                for (let item of items) {
                    if (masterTypeId != item.itemTypeID){
                        item.setType(masterTypeId);
                    }
                }
                await Zotero.Promise.delay(200);
            }
            break;
        }
    }

    /// Merge Items
    this.current_state = "merge_items:handle_alternatives";
    var masterItem = items[masterIndex];
    var _otherItems = items.concat();

    // Add master item's values to the beginning of each set of
    // alternative values so that they're still available if the item box
    // modifies the item
    var alternatives = masterItem.multiDiff(_otherItems, this._ignoreFields);
    if (alternatives) {
        let itemValues = masterItem.toJSON();
        for (let i in alternatives) {
            alternatives[i].unshift(itemValues[i] !== undefined ? itemValues[i] : '');
        }

        var itembox = document.getElementById('zotero-duplicates-merge-item-box');
        for (let param in alternatives){
            if (param == "creators" || param == "tags" || param == "relations" || param == "collections") continue;

            var masterEntryIndex = 0;
            for (let entry in alternatives[param]){
                if (alternatives[param][entry].length > alternatives[param][masterEntryIndex].length){
                    masterEntryIndex = entry;
                }
            }
            if (masterEntryIndex > 0){
                itembox.item.setField(param, alternatives[param][masterEntryIndex]);
            }
        }
        itembox.refresh();
    }

    this.current_state = "merge_items:merging";
    if (performMerge == true){
        await Zotero_Duplicates_Pane.merge();
    }
    this.current_state = "idle";
    return true;
};
/**
 * Selects the next available set of duplicated items
 * @param {ZoteroPane} pane
 * @return {Integer} count of selected items
 * @return {Integer[]}
 */
Zotero.DuplicatesMerger.getNextDuplicatedItems = async function (pane){
    if (typeof pane == 'undefined' || this.selectedItemsList.length > 0) return false;

    this.current_state = "get_next_items";
    var newSelectedItems = pane.getSelectedItems();
    var totalWaitTime = 0;
    while(this.isRunning && totalWaitTime < 30000){
        var newSelectedItemsIds = [];
        // try to get the next items that are selected for 30s at most
        while(this.isRunning && newSelectedItems.length <= 1 && totalWaitTime < 30000){
            this.current_state = "get_next_items:waiting_new_items";

            await Zotero.Promise.delay(100);
            newSelectedItems = pane.getSelectedItems();
            totalWaitTime = totalWaitTime + 100;
        }
        // no new items were selected in time
        if (newSelectedItems.length <= 1){
            if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: get next asking for new selection due to timeout ");
            // attempt to manually select the next items;
            let selectionResult = await this.selectNextDuplicatedItems(pane);
            this.current_state = "idle";
            return selectionResult;
        }
        else{
            for (var item in newSelectedItems)
                newSelectedItemsIds.push(newSelectedItems[item].id);
        }
        // if any items were selected,
        // check to see if they were type mismatches found earlier
        var foundMismatched = false;
        this.current_state = "get_next_items:checking_mismatches";
        for(var i = 0, count = this.mismatchedIds.length; i < count; i++){
            for (var itemId of newSelectedItemsIds){
                if (itemId !== this.mismatchedIds[i]) continue;
                foundMismatched = true;
            }
        }
        if (foundMismatched){
            if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: get next asking for new selection due to mismatch");
            // attempt to manually select the next items;
            let selectionResult = await this.selectNextDuplicatedItems(pane);
            this.current_state = "idle";
            return selectionResult;
        }
        else{
            this.selectedItemsIds = newSelectedItemsIds;
            this.lastProcessedItemId = this.selectedItemsIds[0];
            this.selectedItemsList = newSelectedItems;
            if (this.noSkippedItems > 0) this.noSkippedItems = 0;
            this.current_state = "idle";
            return true;
        }
    }

    if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: getting next exited without a selection");
    this.current_state = "idle";
    return false;
};
/**
 * Selects the next available set of duplicated items
 * @param {ZoteroPane} pane
 * @return {Integer} count of selected items
 * @return {Integer[]}
 */
Zotero.DuplicatesMerger.selectNextDuplicatedItems = async function (pane){
    if (typeof pane == 'undefined' || this.selectedItemsList.length > 0){
        if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: select next exited before starting");
        return false;
    }
    this.current_state = "select_next_items";

    this.noSkippedItems = 0;
    var nextItemIdx = this.noMismatchedItemsSkipped + this.noSkippedItems;
    while (this.isRunning && pane.itemsView.rowCount > nextItemIdx){
        this.current_state = "select_next_items:looking_for_items";
        // find the id of the next available item in the list
        var itemID = pane.itemsView.getRow(nextItemIdx).ref.id;
        var foundMismatched = false;
        for(var i = 0, count = this.mismatchedIds.length; i < count ; i++){
            if (itemID !== this.mismatchedIds[i]) continue;

            this.mismatchedIds.splice(i, 1);
            this.noMismatchedItemsSkipped = this.noMismatchedItemsSkipped + 1;
            nextItemIdx += 1;
            break;
        }
        if (foundMismatched) continue;

        this.current_state = "select_next_items:validating_new_items";
        // get the items that have the same ID as the selected one
        var newSelectedItemsIds = pane.getCollectionTreeRow().ref.getSetItemsByItemID(itemID);
        // if item not found, then it was deleted
        if (newSelectedItemsIds.length == 0){
            if (this.noSkippedItems > 0) // reset skipped items and start over
                this.noSkippedItems = 0;
            await Zotero.Promise.delay(500);
        }
        // if no more than one item exists with the given id (i.e. selected item has no duplicates)
        else if (newSelectedItemsIds.length == 1){
            // add to the current offset so that the non-duplicated item can be skipped next time
            this.noSkippedItems = this.noSkippedItems + 1;
            await Zotero.Promise.delay(500);
        }
        else{ // if the selected item has duplicates
            // mark the id of the item
            this.lastProcessedItemId = itemID;
            // select all items with that id
            this.selectedItemsIds = newSelectedItemsIds;
            pane.itemsView.selectItems(this.selectedItemsIds);
            // and update the references to the selected items
            this.selectedItemsList = pane.itemsView.getSelectedItems();
            if (this.noSkippedItems > 0) this.noSkippedItems = 0;
            this.current_state = "idle";
            return true;
        }
        nextItemIdx = this.noMismatchedItemsSkipped + this.noSkippedItems;
    }

    this.selectedItemsList.length = 0;
    this.noSkippedItems = 0;

    if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: select next exited without a selection");

    this.current_state = "idle";
    return false;
};
/**
 * Check whether the user is still in the duplicates pane
 */
Zotero.DuplicatesMerger.checkFocusAsync = async function (){
    while(this.isRunning){
        await Zotero.Promise.delay(1000);
        this.elapsedTimeSinceLastAction += 1000;

        this.isRunning = this.isRunning && Zotero.getActiveZoteroPane().getCollectionTreeRow().isDuplicates() && this.elapsedTimeSinceLastAction < 120000;
    }
    if (this.elapsedTimeSinceLastAction >= 120000)
        Zotero.logError("(" + getCurrentTime() + ") DuplicatesMerger timed out");

    if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: check focus exited");
};
Zotero.DuplicatesMerger.RestartDuplicatesMerge = async function () {
    var win = Services.wm.getMostRecentWindow('navigator:browser');
    var pane = Zotero.getActiveZoteroPane();
    let libraryId = 1;// pane.getSelectedLibraryID();
    win.ZoteroPane.setVirtual(libraryId, 'duplicates', true, true);
    pane = Zotero.getActiveZoteroPane();
    var elapsedTime = 0;
    while (pane.itemsView.rowCount == 0 && elapsedTime < 60000)
    {
        await Zotero.Promise.delay(2000);
        elapsedTime += 2000;
    }
    if (elapsedTime < 60000){
        this.mergeDuplicates();
        toJavaScriptConsole();
    }
};
/**
 * Main plugin function: Merge duplicate items
 */
Zotero.DuplicatesMerger.mergeDuplicates = async function () {
    // Prevent the merger from running again if it's already running
    if(this.isRunning)
    {
        Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: Merger is already running");
        return;
    }

    // Keep reference of the duplicates pane
    var DuplicatesPane = Zotero.getActiveZoteroPane();
    if (!DuplicatesPane.getCollectionTreeRow().isDuplicates()){
        Zotero.logError("(" + getCurrentTime() + ") DuplicatesMerger: Pane is not duplicates");
        return;
    }
    this.current_state = "merge_duplicates";

    // Notify start of the duplicate merger
    this.isRunning = true;

    var delayBetweenCalls = getPref("delay");
    this.showDebug = getPref("showdebug");

    // Initialize progress / active item counters
    this.noMismatchedItemsSkipped = 0;
    this.noSkippedItems = 0;
    this.lastProcessedItemId = 0;

    this.selectedItemsList = [];
    this.selectedItemsIds = [];
    this.mismatchedIds = [];

    this.initialNoItems = DuplicatesPane.itemsView.rowCount;
    this.currentRowCount = DuplicatesPane.itemsView.rowCount;

    await DuplicatesPane.getCollectionTreeRow().ref.getSearchObject();

    // Create Progress Windows
    this.createProgressWindow();

    // Retrieve the first items from the list
    await this.selectNextDuplicatedItems(DuplicatesPane);

    // var errorCount = 0;
    this.errorCount = 0;
    this.elapsedTimeSinceLastAction = 0;

    this.checkFocusAsync();

    // Go while the duplicates pane is still the selected pane,
    // there are still items to be processed, and plugin hasn't timed out
    while (this.isRunning && this.currentRowCount > (this.noMismatchedItemsSkipped+1) && this.errorCount <= 5) {
        this.current_state = "merge_duplicates:loop";
        try{
            // If there is a set of items selected
            if (this.selectedItemsList.length > 1){
                try{
                    // Try to merge them
                    this.current_state = "merge_duplicates:attempt_merge";
                    let mergeResult = await this.mergeSelectedItems(true);
                    if (mergeResult == true)
                    {
                        this.current_state = "merge_duplicates:successful_merge";

                        var newSelectedItems = DuplicatesPane.getSelectedItems();
                        var newSelectedItemId = newSelectedItems[0].id;
                        if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: successfully merged item " + this.lastProcessedItemId);
                        // merge successfully completed
                        var currentActionElapsedTime = 0;
                        while (this.isRunning && newSelectedItemId == this.lastProcessedItemId && currentActionElapsedTime < 20000){
                            this.current_state = "merge_duplicates:waiting_item_removal";
                            await Zotero.Promise.delay(500);
                            currentActionElapsedTime += 500;
                            newSelectedItems = DuplicatesPane.getSelectedItems();
                            newSelectedItemId = newSelectedItems[0].id;
                        }
                        if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: finished waiting for new id (" + this.lastProcessedItemId + " - " + newSelectedItemId + "), wait time: " + currentActionElapsedTime);
                        if (currentActionElapsedTime < 20000)
                            this.elapsedTimeSinceLastAction = 0;
                        this.updateProgressWindow();
                    }
                    else{
                        this.current_state = "merge_duplicates:found_mismatch";
                        if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: found type mismatch - skipping");
                        for (let newId of this.selectedItemsIds){
                            this.mismatchedIds.splice(0, 0, newId);
                        }
                        this.elapsedTimeSinceLastAction = 0;
                        this.updateProgressWindow();
                    }
                    this.errorCount = 0;
                }catch(e){
                    this.current_state = "merge_duplicates:merging_error";
                    if (this.showDebug){
                        Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: DuplicatesMerger is having some issues");
                        Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: Error while merging of items");
                    }
                    this.errorCount = this.errorCount + 1;
                    await Zotero.Promise.delay(2000);
                    if (this.errorCount > 5){
                        // An error occurred! Stop running and close notification window
                        this.isRunning = false;
                        Zotero.logError("(" + getCurrentTime() + ") DuplicatesMerger is stopping");
                        this.closeProgressWindow(0, this.getFormattedString("general.errorHasOccurredHeader"),this.getFormattedString("general.errorHasOccurredMsg"));
                        await Zotero.Promise.delay(2000);
                        break;
                    }
                } finally{
                    this.selectedItemsList.length = 0;
                    this.noSkippedItems = 0;
                }
            }
            // Wait for a bit and then select the next set of items
            await Zotero.Promise.delay(delayBetweenCalls);
            await this.getNextDuplicatedItems(DuplicatesPane);
        } catch (e) {
            this.current_state = "merge_duplicates:identification_error";
            if (this.showDebug){
                Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: DuplicatesMerger is having some issues");
                Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: Error while retrieving items to merge");
            }
            this.selectedItemsList.length = 0;
            this.noSkippedItems = 0;
            this.errorCount = this.errorCount + 1;
            if (this.errorCount > 5){
                Zotero.logError("(" + getCurrentTime() + ") DuplicatesMerger is stopping due to errors merging");
                // An error occurred! Stop running and close notification window
                this.isRunning = false;
                this.closeProgressWindow(0, this.getFormattedString("general.errorHasOccurredHeader"),this.getFormattedString("general.errorHasOccurredMsg"));
                await Zotero.Promise.delay(2000);
                break;
            }
            await Zotero.Promise.delay(2000);
        }
        this.currentRowCount = DuplicatesPane.itemsView.rowCount;
    }

    await Zotero.Promise.delay(delayBetweenCalls);

    var processed = this.initialNoItems - this.currentRowCount + this.noMismatchedItemsSkipped;
    if (this.currentRowCount == this.noMismatchedItemsSkipped){
        this.closeProgressWindow(1, this.getFormattedString("general.progressCompletedHeader"), this.getFormattedString("general.progressCompleteMsg", [processed]));
    }
    else{
        this.closeProgressWindow(0, this.getFormattedString("general.progressInterrupterHeader"), this.getFormattedString("general.progressInterrupterMsg", [processed]));
    }

    this.isRunning = false;

    if (this.showDebug) Zotero.log("(" + getCurrentTime() + ") DuplicatesMerger: exited with " + processed + " items processed!");

    this.noMismatchedItemsSkipped = 0;
    this.noSkippedItems = 0;
    this.lastProcessedItemId = 0;
    this.currentRowCount = 0;
    this.initialNoItems = 0;
    this.elapsedTimeSinceLastAction = 0;

    this.selectedItemsList = [];
    this.selectedItemsIds = [];
    this.mismatchedIds = [];

    this.current_state = "idle";

    await Zotero.Promise.delay(5000);
    if (this.isRunning) return;

    this.progressWindow = null;
    delete this.progressWindow;
};

if (typeof window !== 'undefined') {
    window.addEventListener('load', function(e) {
        Zotero.DuplicatesMerger.init();
    }, false);
}
GitHub Copilot: Looking at this older plugin code compared to the current Zoplicate implementation, there are several key differences in how type mismatches are handled:
- The old plugin had explicit type mismatch handling with user preferences:
// Old plugin approach
// Let user choose how to handle type mismatches
var typemismatchPreference = getPref("typemismatch");
if (typemismatchPreference == "skip") {
  // Skip the whole duplicate set; the caller then records its IDs in mismatchedIds
  return false;
}
else if (typemismatchPreference == "master") {
  // Force all items to master's type
  for (let item of items) {
    if (masterTypeId != item.itemTypeID) {
      item.setType(masterTypeId);
    }
  }
}
- The current Zoplicate plugin simply filters out mismatched types:
export async function merge(
  masterItem: Zotero.Item,
  otherItems: Zotero.Item[],
): Promise<any> {
  // Simply filter out items of different types
  const masterItemType = masterItem.itemTypeID;
  otherItems = otherItems.filter((item) => item.itemTypeID === masterItemType);
  if (otherItems.length === 0) {
    return;
  }
  // ...existing code...
}
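To make that difference concrete, here is a small sketch of what the filter silently drops. It runs outside Zotero: `ItemLike` and `partitionByType` are hypothetical names made up for illustration (they are not part of Zotero or Zoplicate), and the type IDs are arbitrary placeholders.

```typescript
// Hypothetical stand-in for the only Zotero.Item fields this sketch needs.
interface ItemLike {
  id: number;
  itemTypeID: number;
}

// Split the candidates into the items the current merge() keeps
// and the items it drops because their type differs from the master's.
function partitionByType<T extends ItemLike>(
  master: T,
  others: T[],
): { sameType: T[]; mismatched: T[] } {
  const sameType: T[] = [];
  const mismatched: T[] = [];
  for (const item of others) {
    if (item.itemTypeID === master.itemTypeID) {
      sameType.push(item);
    } else {
      mismatched.push(item);
    }
  }
  return { sameType, mismatched };
}

// Example with placeholder type IDs: item 2 is kept, item 3 is dropped.
const exampleMaster: ItemLike = { id: 1, itemTypeID: 10 };
const exampleOthers: ItemLike[] = [
  { id: 2, itemTypeID: 10 },
  { id: 3, itemTypeID: 20 },
];
const exampleSplit = partitionByType(exampleMaster, exampleOthers);
console.log(
  exampleSplit.sameType.map((i) => i.id),
  exampleSplit.mismatched.map((i) => i.id),
);
```

In the old plugin the equivalent of the `mismatched` list was tracked in `mismatchedIds`, so the user could still see which sets were skipped; with the plain filter that information is lost.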
To update Zoplicate to handle type mismatches more gracefully like the old plugin, we could:
- Add a new preference:
pref("__prefsPrefix__.duplicate.type.mismatch", "skip"); // Options: skip, master, ask
- Update the merger function:
export async function merge(
  masterItem: Zotero.Item,
  otherItems: Zotero.Item[],
): Promise<any> {
  Zotero.CollectionTreeCache.clear();
  const masterItemType = masterItem.itemTypeID;
  const mismatchedItems = otherItems.filter((item) => item.itemTypeID !== masterItemType);
  if (mismatchedItems.length > 0) {
    let typeMismatchPref = getPref("duplicate.type.mismatch") as string;
    if (typeMismatchPref === "ask") {
      // Ask via the standard prompt service; confirmEx returns the index of the pressed button.
      // "Convert" is button 0 so that closing the dialog (which returns 1) falls back to skipping.
      // The type-mismatch-* string keys are new and must be added to the locale file.
      const ps = Services.prompt;
      const buttonFlags =
        ps.BUTTON_POS_0 * ps.BUTTON_TITLE_IS_STRING + ps.BUTTON_POS_1 * ps.BUTTON_TITLE_IS_STRING;
      const result = ps.confirmEx(
        null,
        getString("type-mismatch-title"),
        getString("type-mismatch-message"),
        buttonFlags,
        getString("type-mismatch-convert"),
        getString("type-mismatch-skip"),
        null,
        null,
        {},
      );
      typeMismatchPref = result === 0 ? "master" : "skip";
    }
    if (typeMismatchPref === "master") {
      // Convert mismatched items to the master's type.
      // setType() only changes the in-memory item, so save each one explicitly.
      for (const item of mismatchedItems) {
        item.setType(masterItemType);
        await item.saveTx();
      }
    } else {
      // "skip" (and any unrecognized value): merge only items of the master's type
      otherItems = otherItems.filter((item) => item.itemTypeID === masterItemType);
      if (otherItems.length === 0) return;
    }
  }
  // ...rest of existing merge code...
}
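If it helps with testing, the decision logic can be kept separate from the Zotero calls. The following is only a sketch under assumptions: `MismatchPolicy`, `parseMismatchPolicy`, `planMerge`, and `ItemLike` are hypothetical names, not existing Zoplicate or Zotero APIs, and `"ask"` is assumed to have been resolved to `"skip"` or `"master"` by a dialog before `planMerge` is called.

```typescript
// Allowed values of the (new, hypothetical) type-mismatch preference.
const POLICIES = ["skip", "master", "ask"] as const;
type MismatchPolicy = (typeof POLICIES)[number];

// Fall back to "skip" for missing or unrecognized preference values.
function parseMismatchPolicy(raw: unknown): MismatchPolicy {
  const found = POLICIES.find((p) => p === raw);
  return found ?? "skip";
}

// Hypothetical stand-in for the only Zotero.Item fields this sketch needs.
interface ItemLike {
  id: number;
  itemTypeID: number;
}

interface MergePlan<T> {
  toMerge: T[]; // items to hand to the actual merge step
  toConvert: T[]; // items whose type must be changed and saved first
}

// Pure function: decides what to do, but performs no Zotero calls itself.
function planMerge<T extends ItemLike>(
  master: T,
  others: T[],
  policy: "skip" | "master",
): MergePlan<T> {
  const mismatched = others.filter((i) => i.itemTypeID !== master.itemTypeID);
  if (policy === "master") {
    return { toMerge: [...others], toConvert: mismatched };
  }
  return {
    toMerge: others.filter((i) => i.itemTypeID === master.itemTypeID),
    toConvert: [],
  };
}
```

With this split, `merge()` would only need to read the preference, show the dialog when the policy is `"ask"`, apply `setType` to `toConvert`, and pass `toMerge` on, while the branching itself can be unit-tested without a running Zotero instance.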
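This gives users more control over type mismatches than the current silent filtering. The preference key and the four type-mismatch-* locale strings used above are new, so they would also need to be added to the plugin's preference defaults and locale files. Let me know if you would like me to explain any part in more detail!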