Factify wants to move beyond PDFs and .docx by giving digital documents their own brain



Tel Aviv-based startup Factify came out of stealth today with a $73 million funding round and an ambitious, if quixotic, mission: to take digital documents beyond the standard formats used by most companies – PDF, .docx, collaborative cloud files like Google Docs – and into the era of intelligence.

For Matan Gavish, founder and CEO of Factify, it’s not just a software upgrade: it’s an inevitability that has obsessed him for years.

"The PDF was developed when I was in elementary school," Gavish told VentureBeat. "The foundation of the software ecosystem hasn’t really evolved… someone needs to rethink the digital document itself."

Gavish, a tenured professor of computer science who holds a PhD from Stanford, admits that his fixation on administrative file formats is an anomaly for someone with his qualifications.

"It’s a really uncool problem to be obsessed with," he said. "Since my academic background is in AI and machine learning, my mom wanted me to start an AI company because it’s cool. I don’t know why I’m obsessed and then possessed by documents."

But that obsession has now attracted a major funding round led by Valley Capital Partners and backed by AI heavyweights like former Google AI chief John Giannandrea.

The bet is simple: the static rigidity of most digital files has limited their usefulness, and a better, smarter document that actually shares its edit history and ownership with users as intended is not only possible, it’s a multibillion-dollar opportunity.

The history of digital documents

To understand why a funding round would reach $73 million, we must understand the scale of the trap in which companies find themselves. It is estimated that there are currently around three trillion PDFs in circulation. "Some people see the PDF more than their children," Gavish jokes.

The history of the digital document is not a linear progression where one format replaces another. It is rather a story of "speciation," where different formats have evolved to fill distinct ecological niches: creation, distribution and collaboration.

The era of files: Microsoft Word (1980s-1990s)

Digital documents began as isolated artifacts. In the 1980s, the "document" was inextricably linked to the hardware and software that created it. A file created in WordPerfect on a DOS machine was effectively gibberish to a Macintosh user.

Microsoft Word, whose lineage dates back to Xerox PARC’s pioneering WYSIWYG editors, changed the game by leveraging the dominance of the Windows operating system. In the 1990s, the binary .doc format became the default container for editable business documents. However, these files were structurally complex "memory dumps" designed for the limited hardware of the time, often leading to file corruption or privacy leaks where deleted text remained hidden within the file’s binary data.

The era of digital “stone”: PDF (1990s-2006)

PDF was not born as a writing tool; it was a viewing tool. In 1991, Adobe co-founder John Warnock wrote the "Camelot Project" white paper, envisioning a "digital envelope" that would look the same on any screen or printer.

Unlike Word files, which were malleable, PDFs were designed to be immutable. They used the PostScript imaging model to place characters at precise coordinates, ensuring visual fidelity. Although adoption was initially slow, Adobe’s decision in 1994 to release Acrobat Reader for free established PDF as the global standard for "digital concrete": the final-form format used for contracts, government forms, and records.

The era of collaborative cloud documents (2006 to present)

In 2006, Google disrupted the model again by moving the document from the hard drive to the browser. Using "operational transformation" algorithms, Google Docs allowed multiple users to simultaneously edit the same stream of text.

This changed the paradigm from "send a file" to "share a link." With Google Workspace now claiming more than 3 billion users (primarily consumers and educational institutions), it has fundamentally changed the way we work, transforming documents into living, collaborative processes rather than static artifacts.
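
To make the idea concrete, here is a minimal sketch of operational transformation for concurrent insertions only. It is an illustration under simplifying assumptions, not Google's implementation: the `Insert` type, the `site` tie-breaker and the `transform` function are hypothetical names, and real systems also handle deletions, formatting and server-side ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the text where the string is inserted
    text: str   # string to insert
    site: int   # hypothetical editor id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply a single insertion to the text."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `other` has already been applied."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        # `other` landed before `op`, so shift `op` right by the inserted length.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "send a file"
a = Insert(pos=0, text="please ", site=1)   # editor 1 types at the start
b = Insert(pos=11, text=" today", site=2)   # editor 2 types at the end

# Each replica applies its own edit first, then the transformed remote edit.
replica_a = apply(apply(base, a), transform(b, a))
replica_b = apply(apply(base, b), transform(a, b))
```

Both replicas end up with the same text even though they applied the two edits in different orders, which is the property that lets several people type into one document at once.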

The status quo: fragmentation

Despite these advances, the business world remains fragmented. We write in Google Docs (the "digital flow"), format in Word (the "digital clay"), and sign in PDF (the "digital stone").

But this fragmentation comes at a cost. "The problem is not the document. It’s everything around it," the company notes. "Once a PDF leaves your system, control disappears. The versions vary. Access is not clear. Nothing is visible."

Transforming digital documents into intelligent infrastructure

Factify’s bet is that in the age of AI, this fragmentation is no longer just annoying: it is a critical failure. AI models need structured and verifiable data to work.

When an AI "reads" a PDF, it’s essentially guessing, often relying on optical character recognition to extract text from what is effectively a digital photo.

"We are dealing here with a megalomaniacal vision, but at the same time it is probably something inevitable," Gavish said.

Factify’s solution is to treat documents not as static files, but as intelligent infrastructure. In the "Factified" standard, a document carries its own brain. It has a unique identity, a live authorization system, and an immutable audit log that goes with it.

"We have written a new document format that supplants PostScript," Gavish explains. "We’ve created a new data layer that supports the document as a first-class citizen…and it’s still available inside the organization and potentially outside."

This distinction – between a file and an API – is at the heart of the company’s pitch.

  • Files are liabilities: They accumulate, get lost and can be stolen. "This returns to brick status," Gavish said. "Files are more of a liability because they accumulate there and you have to keep them."

  • APIs are assets: A Factify document is an active object. You can ask it questions: "Who saw you? When do you expire? Are you the most up-to-date version?"
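
As a rough illustration of what a document that answers questions could look like, here is a minimal sketch. It is not Factify's format or API, which the company has not described in this detail; the `SmartDocument` class and every field and method name are hypothetical, and a plain Python list stands in for a truly tamper-proof audit log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SmartDocument:
    doc_id: str                  # unique identity
    version: int
    expires_at: datetime
    allowed_readers: set[str]    # stand-in for a live permission system
    # Append-only by convention here; a real system would enforce immutability.
    audit_log: list[tuple[datetime, str, str]] = field(default_factory=list)

    def read(self, user: str, now: datetime) -> bool:
        """Check access at read time and record the attempt either way."""
        allowed = user in self.allowed_readers and now < self.expires_at
        self.audit_log.append((now, user, "read" if allowed else "denied"))
        return allowed

    def who_saw_you(self) -> list[str]:
        """Answer 'Who saw you?' from the audit log."""
        return [user for _, user, event in self.audit_log if event == "read"]

    def is_latest(self, latest_version: int) -> bool:
        """Answer 'Are you the most up-to-date version?'."""
        return self.version == latest_version


memo = SmartDocument(
    doc_id="memo-001",
    version=2,
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
    allowed_readers={"alice"},
)
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
memo.read("alice", now)     # permitted and logged
memo.read("mallory", now)   # refused, but the attempt is still logged
```

The contrast with a static file is that access is decided at the moment of reading and every attempt leaves a trace, so the three questions above can be answered by the document itself rather than by whoever happens to hold a copy.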

“People don’t change”, but formats do

History is littered with formats that have attempted to replace PDF (like Microsoft’s XPS). They failed because they required too many behavioral changes from users. Gavish is well aware of this trap.

"When I talk to enterprise software entrepreneurs, I tell them that the two laws you need to know about starting an enterprise software business are that people don’t care and no one changes," he said.

To get around this, Factify has built deep backwards compatibility. A Factified document can look exactly like a PDF, with page breaks and margins. Users don’t need to learn a new interface to get value; they just need to solve a specific problem, like an executive who wants to make sure an investment memo can’t be forwarded.

"All they have to say to their team is: 'Dear Chief of Staff, employment contracts and investment memorandums… are going to be factified. The rest continues,'" Gavish said. "They see an immediate benefit…but then discover they’ve crossed the Rubicon."

What’s next for Factify?

The capital from this round will be used to further the core engineering of the platform, which Gavish describes as a "heavy technical lift" requiring the company to rebuild the document format, data layer and application layer from scratch. The company is also establishing a major operations center in Pittsburgh to support its expansion in the United States.

Ultimately, Factify isn’t trying to create another collaboration tool like Google Docs. They are trying to build the immutable register of the future – the standard for "truth" in a digital world.

"PDF… has become a standard, which means I cannot file my taxes using any other format. This is what victory looks like," Gavish said. "We are creating a document standard that is not specific to healthcare or insurance, but is simply a document in its own right."

For the three trillion static files currently stored in the cloud, the writing may finally be on the wall.


