AI Tip: Don’t Convert Google Doc to RTF

Mar 19, 2024 | General


Google Docs blew up our RTF

In our blog, we cover general AI training tips that we hope will help anyone training AI models, especially models built on LlamaIndex, as Vizaport is. This one surprised us, considering how widely Google Docs is used and that we assumed its RTF export was a commonly used feature.

Until Google fixes the issue, do not convert Google Docs to RTF.

Why? At first, our Google Doc converted to RTF seemed harmless. We didn’t think there was an issue until we uploaded the RTF file to our AI training model. It’s worth noting that this would have been an issue with any embedding model, as you will see.

Let’s first take a look at what the file looks like. Using lorem ipsum text, the RTF looks normal when opened in a document viewer.


What’s the issue?  

The RTF file looks normal when you read it. But under the covers, it tells a very different story. AI models depend on reading text, and the text in this file is not readable. Let’s take a closer look…

The file is unreadable for an AI. Notice that the first two words, “Lorem ipsum”, are now buried within RTF formatting control words that carry no meaning for a model.

“.…\cf2 Lorem}{\r….” and “\ulnone\cf2 ipsum}{\rtlch\”, with many additional characters in between.
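To make the problem concrete, here is a minimal Python sketch of how the visible words can be separated from the control words around them. This is not what Vizaport or LlamaIndex does internally. The function name `strip_rtf_markup` and the sample string are our own illustrations (the sample only imitates the fragments quoted above), and the regex is a crude approximation of RTF syntax rather than a full parser.

```python
import re

# One pattern for the pieces of RTF syntax that are not document text:
#   \\ \{ \}    escaped literal characters (kept, via group 1)
#   \'hh        hex-encoded character (simply dropped in this sketch)
#   \word123    control word, optional numeric parameter, one trailing space
#   \~ \- ...   other control symbols
#   { } CR LF   group delimiters and raw line breaks, which RTF readers ignore
_RTF_TOKEN = re.compile(
    r"\\([\\{}])"
    r"|\\'[0-9a-fA-F]{2}"
    r"|\\[a-zA-Z]+-?\d* ?"
    r"|\\[^a-zA-Z]"
    r"|[{}\r\n]"
)


def strip_rtf_markup(rtf: str) -> str:
    """Return an approximation of the text a human sees in an RTF string."""
    # Keep escaped literals, drop every other piece of markup.
    text = _RTF_TOKEN.sub(lambda m: m.group(1) or "", rtf)
    # Collapse the whitespace left behind by removed markup.
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical sample: each word sits in its own formatting group.
sample = r"{\rtf1\ansi {\rtlch\cf2 Lorem }{\rtlch \ulnone\cf2 ipsum}}"
print(strip_rtf_markup(sample))  # prints: Lorem ipsum
```

The sketch does not skip header sections such as font or color tables, so on a real export their contents would leak into the output. For anything beyond a quick inspection, a proper RTF parser is the safer choice.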

This made it unrecognizable to our AI. It didn’t understand anything that we loaded. That was the first clue. The second was the size of the file and how much data it added to our vector database. The export blew the file up to megabytes when it should have been a few kilobytes.
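Both clues can be checked before a file ever reaches the vector database. The sketch below is one way to do that in Python, under our own assumptions: the names `markup_fraction` and `looks_bloated` are hypothetical, the regex only approximates RTF syntax, and the 0.5 threshold is an arbitrary starting point rather than a measured value.

```python
import re

# Control words (\cf2, \rtlch), hex escapes (\'e9), control symbols,
# and group braces. Everything else is counted as document text.
_MARKUP = re.compile(r"\\'[0-9a-fA-F]{2}|\\[a-zA-Z]+-?\d* ?|\\[^a-zA-Z]|[{}]")


def markup_fraction(raw: str) -> float:
    """Share of characters in `raw` that belong to RTF markup (0.0 to 1.0)."""
    if not raw:
        return 0.0
    markup_chars = sum(len(m.group(0)) for m in _MARKUP.finditer(raw))
    return markup_chars / len(raw)


def looks_bloated(raw: str, max_fraction: float = 0.5) -> bool:
    """True when markup outweighs text by more than the chosen threshold."""
    # max_fraction is an assumed default; tune it on your own documents.
    return markup_fraction(raw) > max_fraction


# Hypothetical sample imitating per-word formatting groups.
sample = r"{\rtf1\ansi {\rtlch\cf2 Lorem }{\rtlch \ulnone\cf2 ipsum}}"
if looks_bloated(sample):
    print("Mostly markup: re-export this document before embedding it.")
```

A file that is mostly markup will also be much larger on disk than its visible text suggests, so comparing the file size against the length of the extracted text is a reasonable second check.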


Lesson Learned

We won’t do that again. We hope that those reading this blog will also find an alternative way to get their data out of Google Docs (such as converting to .doc format, if your AI supports it). We also hope that Google fixes this issue soon, because RTF is a very helpful format for loading data into Vizaport.
