Managing Duplicates in Your Bundle

Managing duplicates can be a time-consuming challenge in the run-up to a hearing. Duplicates can exist in a dataset for many reasons, such as collecting the same document from multiple sources, collecting multiple versions of the same document (near-duplicates), or retaining non-inclusive emails (messages whose content is wholly contained in a later email in the same thread).

The Commercial Court Guide clearly states “no more than one copy of any one document should be included, unless there is good reason for doing otherwise”, meaning that legal teams have a responsibility to ensure that unnecessary duplicative documents are removed from the bundle.

Deduplication is the process by which exact duplicates are eliminated from the dataset, and it can usually be carried out by your chosen software provider, such as your eDiscovery platform, ebundling or trial presentation software. Most deduplication tools compare documents by hash value. A hash value is a fixed-length identifier calculated from a document's content, so identical files produce identical hash values, while any change to the content produces a different one. If two documents share the same hash value, they are treated as exact duplicates of one another, and all but one copy is eliminated.
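The hash-based comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not how any particular platform implements deduplication: the function names, the choice of SHA-256, and the in-memory dictionary of documents are all assumptions made for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a SHA-256 hex digest of the document's raw bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep the first copy of each distinct document.

    Returns (kept, dropped), where `dropped` maps each removed document
    name to the name of the retained copy it duplicates.
    """
    first_seen: dict[str, str] = {}   # hash value -> name of the retained copy
    kept: dict[str, bytes] = {}
    dropped: dict[str, str] = {}
    for name, data in documents.items():
        digest = content_hash(data)
        if digest in first_seen:
            dropped[name] = first_seen[digest]
        else:
            first_seen[digest] = name
            kept[name] = data
    return kept, dropped


if __name__ == "__main__":
    # Hypothetical file names and contents, for illustration only.
    docs = {
        "custodian_a/contract.pdf": b"Signed contract text",
        "custodian_b/contract.pdf": b"Signed contract text",
        "custodian_a/draft.pdf": b"Draft contract text",
    }
    kept, dropped = deduplicate(docs)
    print("Kept:", list(kept))
    print("Dropped:", dropped)
```

Recording which retained copy each dropped document duplicates, rather than simply discarding it, keeps an audit trail of where each document was originally collected from.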

Near-duplicates, on the other hand, are often handled differently within the context of an investigation or trial. It may be relevant to the matter that, for example, multiple versions of a contract existed, or that someone relevant to the proceedings forwarded a confidential email. These near-duplicates may need to be included in the bundle alongside their counterparts for comparison. Hash value deduplication is not well suited to near-duplicates, because even a slight change to a document produces a different hash value. Instead, your provider may be able to help you with near-duplicate analysis, which can flag documents that are, for example, 95% or more similar to one another. Documents with slight changes to metadata or text are then highlighted for human review, so that a decision can be made on whether to include them in the trial bundle.
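As a rough illustration of near-duplicate analysis, the sketch below scores pairs of documents by text similarity and flags those at or above a 95% threshold for human review. It uses Python's standard `difflib.SequenceMatcher` purely for demonstration; commercial tools may use different techniques, and the function names and sample texts are assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(text_a: str, text_b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two extracted texts."""
    return SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()


def find_near_duplicates(
    texts: dict[str, str], threshold: float = 0.95
) -> list[tuple[str, str, float]]:
    """Flag document pairs that are very similar but not identical.

    Pairs scoring exactly 1.0 are skipped, on the assumption that exact
    duplicates are handled separately by hash-based deduplication.
    """
    flagged = []
    for (name_a, text_a), (name_b, text_b) in combinations(texts.items(), 2):
        score = similarity(text_a, text_b)
        if threshold <= score < 1.0:
            flagged.append((name_a, name_b, score))
    return flagged


if __name__ == "__main__":
    # Hypothetical documents, for illustration only.
    texts = {
        "contract_v1": "The Supplier shall deliver the goods to the Buyer within 30 days of the date of this agreement.",
        "contract_v2": "The Supplier shall deliver the goods to the Buyer within 45 days of the date of this agreement.",
        "minutes": "Minutes of the board meeting held on 3 March.",
    }
    for name_a, name_b, score in find_near_duplicates(texts):
        print(f"{name_a} and {name_b}: {score:.1%} similar, flag for review")
```

Comparing every pair of documents in this way becomes slow on large datasets, so it is best read as an explanation of the idea rather than a scalable approach.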

TrialView can support you with managing duplicates as quickly and seamlessly as possible. Our workspace assigns a unique ID to each individual document, making it easy to identify where a document exists across different locations. This means that if a document exists in two separate bundles, both link to the same underlying document, which is essential for bundle and version management.

For more information, get in touch at

Advantage AI: The litigation game is changing

How can we better use AI tools in the world of disputes?

With automation on the rise across the legal profession, it is easy to forget how impactful AI could be on the disputes sector.
Eimear McCann explores the true value of engaging AI tools in litigation, in a guest post for Legal Futures.

The need to trawl through paper files to find a specific document or file has been replaced by a simple search on a digital platform. The capacity to really interrogate the evidence across an entire data set is incredibly powerful, allowing users to ask questions and retrieve specific answers within seconds.

The insights and patterns that can take hours to identify on paper can be automated and enhanced, with advanced bundling tools and automated transcription expediting hearing and witness preparation.

The inexorable rise of AI-driven technology will gradually lead to less human intervention and a greater use of automated, intelligent processes.

Read more here.

NLJ: The Environmental Cost of Dispute Resolution

Eimear McCann and Michael Wilkinson question whether we need a profession-wide approach to reduce the environmental impact of litigation in the latest edition of the New Law Journal.

With increasing pressure to meet ESG commitments, should the courts push for more sustainable litigation, by leveraging technology; and do we need rule reform to see a complete shift in mindset and practice?

Incredibly, according to the Campaign for Greener Arbitrations, the average international arbitration takes nearly 20,000 trees to offset (although, as offsetting is itself deeply problematic, it is always better to reduce emissions in the first place).

If the environmental reasons don’t change behaviour, however, then client-driven imperatives might. Wilkinson and McCann write: ‘Increasingly, corporate clients are operating within an environmental, social and governance (ESG) framework and are beholden to their stakeholders. They may have contractual commitments to endeavour to reduce their emissions; their funding may even have been subject to such commitments. Increasingly, regulations require companies to report on their carbon emissions and transition plans, and shareholders may call for more environmentally responsible behaviour.’

Read more here

The impact of AI on Disputes

Speaking to CDR for their piece on the impact of AI on the disputes world, our CEO, Stephen Dowling, describes ChatGPT as a “game-changer” for the profession: “It combines cognitive and semantic search, familiarisation tools, and generative technologies to generate content – all these functionalities are unlocked by natural language processing”.

Given AI's potential to completely revolutionise dispute resolution, a pragmatic approach is recommended: “[stakeholders] need to be wary of ensuring the material they are relying on and [the output] generated is constantly sense-checked and verified by human actors – that sense-check could also be augmented by other technology…but the critical thing is not to be misled, and ensure that factual inaccuracies are no greater than those in the manual world.”

Read more here.