Old websites seldom die: using the Wayback Machine in litigation

 

by Nathan Peplinski   |   Michigan Bar Journal

“Once something is on the internet, it will never go away.”

While this adage can be true, finding an actual record of an inactive web page or website can be frustrating. In litigation, ascertaining precisely what content was posted and when it appeared can be crucial.

Often, litigants who control a website scrub information detrimental to their case. Also, website marketing is a dynamic, ever-changing practice involving constant text and keyword modification to improve traffic. Even in the absence of malice, something seen on a website one day may disappear the next, which can aggravate counsel or opposing parties who did not secure proof of the content when it was live.

The Wayback Machine can be a window to the past, at least figuratively.

WHAT IS THE WAYBACK MACHINE?

The Internet Archive is a 501(c)(3) non-profit entity founded to gather knowledge in a digital format and make it available to everyone in the world in perpetuity.1 In 1996, the organization began inventorying the web — at least commercially — using the Wayback Machine, a tool developed to collect books, television, movies, radio materials, concerts, and websites.2

The system seeks to preserve our culture’s digital artifacts and heritage for researchers, historians, and scholars by indexing web page data using a program or automated script.3 Its web-crawling is the same method search engines like Google employ to deliver results.4 The Internet Archive uses the data it collects to create a three-dimensional index for browsing web materials over recorded periods.5

By simply entering a website’s URL, the Wayback Machine lets you view that site’s past iterations, assuming it was archived. You select the version by choosing a date from a timeline of yearly calendars.
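
The same lookup can be done by typing an address directly into the browser. The following is a minimal Python sketch, assuming the archive's public replay addresses take the form of a `https://web.archive.org/web/` prefix, a 14-digit timestamp, and the original URL, and that the service answers with the capture closest to the requested date. The function name `wayback_url` and the example site are hypothetical and used only for illustration.

```python
from datetime import datetime

# Assumed public replay prefix; it is not described in this article's sources.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(target_url: str, when: datetime) -> str:
    """Build an address requesting the capture of target_url nearest to `when`."""
    # The timestamp is written as 14 digits: YYYYMMDDhhmmss.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{target_url}"


if __name__ == "__main__":
    # Hypothetical example: how did this site look around March 1, 2015?
    print(wayback_url("https://www.example.com/", datetime(2015, 3, 1)))
```

If no capture exists for the exact moment requested, the page that loads may carry a different date, so the date shown on the retrieved page should be the one recorded in any exhibit.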

WAYBACK MACHINE LIMITATIONS

The Wayback Machine only gathers publicly available information and does not index or archive information on password-protected websites, pages on secured servers, or online message boards.6 A site owner can also request that it disregard their website by establishing robot exclusions.7 Despite these restrictions, the amount of information the Wayback Machine continues to collect is astounding; the Internet Archive indicates it has archived over 806 billion web pages as of April 2023.8
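
To illustrate how a robot exclusion works in practice, the sketch below uses Python's standard `urllib.robotparser` module to read a hypothetical `robots.txt` file and report whether a given crawler may fetch a page. The file contents, the example site, and the helper name `is_allowed` are invented for illustration, and the user-agent token `ia_archiver` is an assumption about how a site owner might address the archive's crawler; consult the Internet Archive's own guidance before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawler is excluded from the whole site,
# and every other crawler is excluded only from /private/.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.example.com/about"
    print(is_allowed(ROBOTS_TXT, "ia_archiver", page))  # excluded crawler
    print(is_allowed(ROBOTS_TXT, "ExampleBot", page))   # any other crawler
```

For a litigator, the practical point is that an exclusion file like this one can explain why a site that plainly existed has no captures for a given period.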

Depending on the target website and search parameters, Wayback Machine users may see grayed-out graphics or no images due to difficulties capturing JavaScript elements or an inability to incorporate graphic material into the archive.9 But even when the complete page is not recorded, the system may still archive relevant information, especially written text.

Another concern is external hyperlinks that take you to a different website. Clicking a link on an archived page will likely take you to another point in time on that site. Though the Wayback Machine tries to deliver third-party pages contemporaneous with the target website, the date of the linked material will have likely changed. The Wayback Machine displays the web page’s date at the top of the browser.10
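
Because a linked page may have been captured on a different date than the page that links to it, it can help to compare the two capture dates before relying on both. The sketch below assumes that an archived page's address contains a 14-digit capture timestamp after `/web/`; the function names and example addresses are hypothetical.

```python
import re
from datetime import datetime

# Assumed address layout: .../web/<14-digit timestamp>/<original URL>
_CAPTURE = re.compile(r"/web/(\d{14})")


def capture_time(archived_url: str) -> datetime:
    """Extract the capture timestamp embedded in an archived page address."""
    match = _CAPTURE.search(archived_url)
    if match is None:
        raise ValueError(f"no capture timestamp found in {archived_url!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


def capture_gap_days(page_url: str, linked_url: str) -> int:
    """Return the number of whole days separating two captures."""
    return abs(capture_time(linked_url) - capture_time(page_url)).days


if __name__ == "__main__":
    page = "https://web.archive.org/web/20150301120000/https://www.example.com/"
    link = "https://web.archive.org/web/20150815120000/https://www.example.org/terms"
    print(capture_gap_days(page, link))  # 167
```

A large gap signals that the linked material may not reflect what a visitor would have seen on the date of the original page.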

Users can ask the Wayback Machine to capture a single web page via its “Save Page Now” feature. The archive will maintain the record for as long as the website does not block crawlers, even if the site owner later changes or deletes the page. These requests are anonymous; the system does not keep the requestor’s IP address.11

USE OF THE WAYBACK MACHINE IN LITIGATION

Because historical information is often vital to prosecuting or defending a lawsuit, it can be determinative to counsel, judges, and juries. Although the Wayback Machine was not created for this use, the Internet Archive configured its system to make its contents more useful in legal matters. The first step for any lawyer attempting to use the Wayback Machine in litigation should be to avoid involving Internet Archive staff.

Authenticating the Record

As with any form of evidence, authenticating archived Wayback Machine pages is essential. The proffering attorney must show that the record is what it purports to be. But as a non-profit organization, the Internet Archive has limited resources, and the people maintaining the archive would rather devote their efforts to indexing information for posterity than be drawn into your dispute. The organization makes clear that attorneys must first address any authentication issues themselves.12

The most streamlined means for authenticating an archived web page is simply asking the opposing party to stipulate that the captured information is a correct and accurate record. Given that a neutral, unbiased source maintains the information, there should be a good basis for counsel to admit that the web page stated what it stated at that time and move on to its impact on the litigation.

A request for admission is another way to enter the information into evidence, which Michigan’s state and federal courts allow.13 “[R]equests for admission are used to establish admission of facts about which there is no real dispute.”14 The Internet Archive’s purpose, operation, and inherent neutrality are tough to discredit; providing the opposing party with the Wayback Machine record attached to a request to confirm the system correctly recorded what was previously in existence should elicit an admission.

Another means of authentication is deposing the party maintaining the relevant website’s content, especially when the opposing party’s website is at issue. You may not know the person with knowledge of the website’s content, but you can likely overcome that by taking what is often called a corporate representative deposition.15

Judicial Notice

Having the court take notice of material obtained from the Wayback Machine is another path to admitting evidence. “Judicial notice is based upon very obvious reasons of convenience and expediency; and the wisdom of dispensing with proof of matters within the common knowledge of everyone has never been questioned.”16

Michigan Rule of Evidence 201(b) states:

“A judicially noticed fact must be one not subject to reasonable dispute in that it is either (1) generally known within the territorial jurisdiction of the trial court or (2) capable of accurate and ready determination by resort to sources whose accuracy cannot reasonably be questioned.”

The state rule is consistent with its federal counterpart, FRE 201.17 Since web crawling takes a snapshot of a web page’s contents at a distinct moment, recording that content should meet the rules’ requirements. The crawler has no decision-making power and cannot alter the impressions it captures; indeed, doing so would defeat the Wayback Machine’s very purpose as a tool to record the history of the web.

The U.S. District Court for the Eastern District of Michigan concluded as much in a dispute over the parties’ web presences and their respective uses of a certain phrase. The judge even used the Wayback Machine to obtain relevant information, ruling that “[t]he Court takes judicial notice of the parties’ historical internet presence as represented by the Internet Archive.”18

Numerous other courts have reached similar conclusions regarding the Wayback Machine as the proper vehicle for judicial notice of a web page’s contents.19 Patent examiners have also used it in similar situations.20

But judicial notice of Wayback Machine records is not universally accepted.21 Some courts that have addressed challenges to Wayback Machine materials require further authentication of the evidence before admission.22 Other courts require an “affidavit of a person with personal knowledge who can attest that the third-party crawler operates to create an unaltered copy of a website as it appears on a given day.”23

Conversely, the U.S. District Court for the Northern District of Illinois noted that authentication is a low bar requiring only a prima facie showing of genuineness; whether the opposing party has presented evidence of bias or a lack of reliability in material obtained from the Wayback Machine is a question for the jury.24 The added authentication requirement that other courts impose seems to defeat much of the purpose of judicial notice and ostensibly creates two unnecessary hurdles.

First, as a practical consideration, because a lawyer cannot be a witness in the case, they would have to find a testifying witness to personally repeat the steps of accessing the Wayback Machine, conducting the search, and printing the relevant material.

Second, the Internet Archive is a non-profit organization with limited resources; it neither charges fees for access or retrieval nor permits advertising, opting to fund its operation primarily through donations.25 Compelling Internet Archive personnel to help a party satisfy an affidavit requirement places an unwanted burden on them. While they will authenticate records if needed, it may take additional time to complete that task.26 One exception to its fee-free structure is a processing charge for each authentication request for each website URL. The Internet Archive’s standard affidavit language, including its statement regarding crawler software, can be reviewed on its website.27

Hearsay Challenges and Exceptions

Some lawyers seeking to block admission of Wayback Machine records have raised hearsay objections. Hearsay can be a complicated issue; exceptions to the general exclusionary rule depend on the purpose for which the evidence is introduced and in what manner, case by case. Even a single document or record can reflect multiple levels of hearsay.

Hearsay is generally defined as an out-of-court statement offered to prove the truth of the matter asserted.28 Arguably, a web page’s record of existence at a specific time fails to meet this definition; some courts have stated that such a record is not a statement at all because it is an image generated by a machine.29 At least one court has also noted that the records the Internet Archive maintains fall under the business record hearsay exception.30 If the party maintaining the website is a party to the case, it is also possible that the statement recorded on the website would amount to a statement against that party’s interest or an admission by a party opponent.31

Counsel on both sides should carefully explore hearsay objections and exceptions before trying to introduce or block the gathered Wayback Machine information at trial. Yet even when a piece of evidence may not seem admissible, that does not mean it is useless. If the end goal of litigation is seeking the truth and delivering justice, knowledge of previously existing information can be valuable when advising your client or confronting the other side. Thorough investigation and extensive discovery are therefore always necessary to advocate a matter effectively, and the Wayback Machine can be a worthwhile tool in that endeavor.

CONCLUSION

The Wayback Machine is a digital looking glass that enables attorneys searching for earlier versions of websites and web pages to magically peer into the past. Although the records it produces have limitations and you cannot count on the system to capture every line of code, the Wayback Machine may provide the evidence you need to support your client’s claims or defenses.

Like every other tool, however, users must wield it carefully. Counsel should develop a sound strategy to ensure the admissibility of a Wayback Machine-indexed web page so the jury or judge can properly consider it. That requires an in-depth understanding of evidentiary rules related to authentication, hearsay, and admitting a Wayback Machine record during a deposition or by a request for admission.

But even when the opportunity to admit such evidence may be difficult or doubtful, the Wayback Machine can still be an effective way to travel back in time to retrieve a defunct or materially altered web page and prove its existence. As expressly stated in the comments to Michigan Rule of Professional Conduct 1.1:

“To maintain the requisite knowledge and skill, a lawyer should engage in continuing study and education, including the knowledge and skills regarding existing and developing technology that are reasonably necessary to provide competent representation for the client in a particular matter.”

Familiarizing oneself with innovations like the Wayback Machine isn’t just advisable — it is arguably part of the lawyer’s ethical duty to maintain competence.


“Best Practices” is a regular column of the Michigan Bar Journal, edited by George Strander for the Michigan Bar Journal Committee. To contribute an article, contact Mr. Strander at gstrander@yahoo.com.


ENDNOTES

1. Archive.org Information, Internet Archive <https://help.archive.org/help/archive-org-information/> [https://perma.cc/PP4Y-Y826]. All websites cited in this article were accessed July 26, 2023.

2. Introductory Tour of Archive.Org and its Collections, Internet Archive <https://help.archive.org/help/introductory-tour-of-archive-org-and-its-collections/> [https://perma.cc/B4YS-YV45].

3. Wayback Machine General Information, Internet Archive <https://help.archive.org/help/wayback-machine-general-information/> [https://perma.cc/H52H-6HFR].

4. Dilmegani, Web Crawler: What It Is, How It Works & Applications in 2023, AI Multiple (June 6, 2023) <https://research.aimultiple.com/web-crawler/> [https://perma.cc/NFF5-UC2Y].

5. Wayback Machine General Information.

6. Id.

7. The Robots Exclusion Protocol allows a website to exclude automated clients, such as a web crawler, from accessing the site, in part or completely. Formalizing the Robots Exclusion Protocol Specification, Google Search Central Blog (July 1, 2019) <https://developers.google.com/search/blog/2019/07/rep-id> [https://perma.cc/BNY3-7ZFE].

8. Wayback Machine, Internet Archive <https://archive.org/web/> [https://perma.cc/HM68-QEBM].

9. Using the Wayback Machine, Internet Archive <https://help.archive.org/help/using-the-wayback-machine/> [https://perma.cc/KBS9-Q5F5].

10. E.g., Harvey Kruse, P.C. <https://www.harveykruse.com/attorney-profiles/> [https://perma.cc/BV69-53J4].

11. Save Pages in the Wayback Machine, Internet Archive <https://help.archive.org/help/save-pages-in-the-wayback-machine/> [https://perma.cc/7X9L-Y7PN].

12. Information Requests, Internet Archive <https://archive.org/legal> [https://perma.cc/9539-44DV].

13. MCR 2.312 and FRCP 36. The Michigan court rule regarding requests for admission is modeled after the federal rule.

14. Lawrence v Burdi, 314 Mich App 203, 213; 886 NW2d 748 (2016).

15. MCR 2.306(B)(1)(b) allows a deposition notice to list “a general description sufficient to identify the person or the particular class or group to which the person belongs” when noticing a deposition.

16. Winekoff v Pospisil, 384 Mich 260, 268; 181 NW2d 897 (1970).

17. MRE 201, Notes.

18. Pond Guy, Inc v Aquascape Designs, Inc, unpublished order of the United States District Court for the Eastern District of Michigan, issued June 24, 2014 (Docket No 13-13229).

19. See Perera v AG United States, 536 F App’x 240, 242 n 3 (CA 3, 2013); UL LLC v Space Chariot, Inc, 250 F Supp 3d 596, 604 n 2 (CD Cal, 2017); In re Packaged Seafood Prod Antitrust Litigation, 338 F Supp 3d 1118, 1132 n 8 (SD Cal, 2018); Safeway Transit Ltd Liability Co v Discount Party Bus, Inc, 334 F Supp 3d 995, 1000 n 4 (D Minn, 2018); East Coast Test Prep LLC v Allnursescom, Inc, 307 F Supp 3d 952, 967 (D Minn, 2018); FTC v Omics Group, 374 F Supp 3d 994, 1004 n 6 (D Nev, 2019); In re Facebook, Inc Securities Litigation, 405 F Supp 3d 809, 829 (ND Cal, 2019); Pohl v MH Sub I, LLC, 332 FRD 713, 716 (ND Fla, 2019); MD v Abbot, 509 F Supp 3d 683, 713 n 19 (SD Tex, 2020); Southern Environmental Law Ctr v Council on Environmental Quality, 446 F Supp 3d 107, 113 n 6 (WD Va, 2020); DotStrategy, Co v Twitter Inc, 476 F Supp 3d 978, 980 n 1 (ND Cal, 2020); Cooper v Simpson Strong-Tie Co, 460 F Supp 3d 894, 905 (ND Cal, 2020); Sturgis Motorcycle Rally, Inc v Rushmore Photo & Gifts, Inc, 529 F Supp 3d 940, 960 (D SD, 2021); Brown v Google LLC, 525 F Supp 3d 1049, 1061 (ND Cal, 2021); Yoon v Lululemon United States, 549 F Supp 3d 1073, 1079 (CD Cal, 2021).

20. Valve Corp v Ironburg Inventions Ltd, 8 F 4th 1364, 1374 (CA Fed, 2021).

21. Weinhoffer v Davie Shoring, Inc, 23 F 4th 579, 584 (CA 5, 2022) and Ward v American Airlines, 498 F Supp 3d 909, 915 n 1 (ND Tex, 2020).

22. Pohl v MH Sub I, LLC, 332 FRD 713, 716, 717 (ND Fla, 2019).

23. Id., at 718.

24. Telewizja Polska USA, Inc v Echostar Satellite Corp, unpublished memorandum opinion and order of the United States District Court for the Northern District of Illinois, Eastern Division, issued October 15, 2004 (Case No 02 C 3293).

25. A Message from Internet Archive Founder, Brewster Kahle, Internet Archive <https://archive.org/donate?origin=iawww-TopNavDonateButton> [https://perma.cc/225L-WMCU].

26. Information Requests, Internet Archive <https://archive.org/legal> [https://perma.cc/QV7K-A6VM].

27. Standard Affidavit, Internet Archive <https://archive.org/legal/affidavit.php> [https://perma.cc/CX4V-UR4C].

28. FRE 801(c) and MRE 801(c).

29. McGucken v Triton Electric Vehicle Ltd Liability Co, unpublished order of the United States District Court for the Central District of California, issued March 21, 2022 (Case No CV 21-3624-DMG (GJSx)); Hansen Cold Storage Construction v Cold Systems, unpublished order of the United States District Court for the Central District of California, issued February 11, 2022 (Case No 2:19-cv-07617-SB-MAA); Abu-Lughod v Calis, unpublished order of the United States District Court for the Central District of California, issued October 9, 2014 (Docket No CV 13-2792 DMG (RXx)); and Telewizja Polska USA, Inc v Echostar Satellite Corp.

30. Abu-Lughod v Calis.

31. MRE 804(b)(3), FRE 804(b)(3), MRE 801(d)(2), and FRE 801(d)(2).