When Paperwork Takes More Time Than the Actual Construction
In the life of a construction company, a huge volume of documents is generated: incoming invoices, outgoing invoices, completion certificates, contracts, proposals. The team at Murabau Kft. was processing dozens of such documents per month by hand. Every invoice had to be opened, the data read from it, and then manually entered into a spreadsheet.
This consumed 8-12 work hours per month and invited errors: mistyped invoice numbers, wrong amounts, missing tax IDs. If you work in construction, you know the situation: instead of doing actual work, you spend your time on administration.
Murabau's management finally decided there had to be a better way.
What Is AI OCR, and Why Is It Different from a Traditional Scanner?
Before we get to the solution, it's worth clarifying what OCR (Optical Character Recognition) means and why it has only recently become a truly practical technology.
Traditional OCR software (the kind that comes with your scanner) tries to recognize text character by character. This approach works acceptably with templated, rigid documents, but as soon as the format changes, you're in trouble. A Hungarian construction invoice doesn't look like a German or English one. Every invoicing software uses a different layout, different fonts, different field names.
AI-based OCR approaches the task differently. The AI model doesn't search for characters — it understands the document's structure. It knows that the number following "Gross amount" is the total, even if it appears in a different font, a different position, or a different layout. It essentially reads the invoice the way a human would, just much faster.
AI OCR processing happens in three steps:
- Visual recognition: the model "looks at" the PDF as if it were an image
- Context interpretation: it identifies the fields (invoice number, amount, date, partner name, etc.)
- Structured output: it returns the data in organized JSON format that the system can process automatically
This approach doesn't depend on a fixed template or layout, so it can handle Hungarian-language invoices regardless of which invoicing software produced them.
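To make the "structured output" step more concrete, here is a minimal sketch of what consuming such a JSON response could look like. The field names (`invoice_number`, `issue_date`, `partner_name`, `gross_amount`) and the sample values are assumptions made up for illustration; the actual schema used in the Murabau workflow is not shown in this article.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Hypothetical model response. The field names and values are invented for
# this example and are not the real schema of the Murabau workflow.
SAMPLE_RESPONSE = """
{
  "invoice_number": "2024-00123",
  "issue_date": "2024-03-14",
  "partner_name": "Example Supplier Kft.",
  "gross_amount": "127000"
}
"""


@dataclass(frozen=True)
class ExtractedInvoice:
    invoice_number: str
    issue_date: date
    partner_name: str
    gross_amount: Decimal


def parse_invoice(raw_json: str) -> ExtractedInvoice:
    """Turn the model's JSON text into a typed record.

    A missing key raises KeyError, so an incomplete response fails loudly
    instead of being filed with silent gaps.
    """
    data = json.loads(raw_json)
    return ExtractedInvoice(
        invoice_number=data["invoice_number"],
        issue_date=date.fromisoformat(data["issue_date"]),
        partner_name=data["partner_name"],
        gross_amount=Decimal(data["gross_amount"]),
    )


invoice = parse_invoice(SAMPLE_RESPONSE)
print(invoice.invoice_number, invoice.gross_amount)  # 2024-00123 127000
```

Because the response is already keyed by field, nothing here depends on where the value appeared on the page, which is the practical benefit of the third step.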
The Solution: n8n + Mistral AI OCR
For Murabau Kft., we built a system with three components. The whole point is simplicity: the user needs zero technical knowledge.
PDF upload via a web form
The user opens a simple web interface (provided by n8n), selects the PDF file, and presses the "Upload" button. They can choose whether it's an invoice or a completion certificate.
This is the only step a human needs to perform. Everything else happens automatically.
Mistral AI processes the document
The uploaded PDF goes to Mistral AI's picoVision model, which extracts the following data:
- Invoice number and issue date
- Issuer's name, address, tax ID
- Buyer's data (for verification)
- Line items with quantities and unit prices
- Net, VAT, and gross amounts
- Payment deadline and payment method
- Fulfillment date
The model returns the extracted data in structured JSON format, making it machine-processable.
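Since the extracted fields include line items as well as net, VAT, and gross amounts, the numbers can be cross-checked against each other before filing. The sketch below is one way such a check might look; it is not taken from the Murabau workflow. The key names (`net_amount`, `vat_amount`, `gross_amount`, `line_items`, `quantity`, `unit_price`) and the one-forint rounding tolerance are assumptions.

```python
from decimal import Decimal


def check_totals(extracted: dict, tolerance: Decimal = Decimal("1")) -> list[str]:
    """Return human-readable problems; an empty list means the totals agree.

    The key names and the tolerance are assumptions for this sketch.
    """
    problems = []
    net = Decimal(str(extracted["net_amount"]))
    vat = Decimal(str(extracted["vat_amount"]))
    gross = Decimal(str(extracted["gross_amount"]))

    # Check 1: net + VAT should equal the gross amount.
    if abs(net + vat - gross) > tolerance:
        problems.append(f"net + VAT = {net + vat}, but gross = {gross}")

    # Check 2: quantity * unit price, summed over all lines, should equal net.
    line_total = sum(
        (
            Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"]))
            for item in extracted["line_items"]
        ),
        Decimal("0"),
    )
    if abs(line_total - net) > tolerance:
        problems.append(f"line items sum to {line_total}, but net = {net}")

    return problems


# Made-up invoice using the 27% Hungarian VAT rate.
good = {
    "net_amount": 100000,
    "vat_amount": 27000,
    "gross_amount": 127000,
    "line_items": [
        {"quantity": 10, "unit_price": 8000},
        {"quantity": 2, "unit_price": 10000},
    ],
}
# Same invoice with two digits of the gross amount swapped.
bad = {**good, "gross_amount": 172000}

print(check_totals(good))
print(check_totals(bad))
```

A check like this catches exactly the kind of mistake that manual entry produced, such as a mistyped amount, whether it comes from a person or from the model.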
Automatic filing in Airtable
The n8n workflow takes the JSON data and automatically records it in the Airtable register. During filing, the system:
- Categorizes the document into the appropriate group (incoming invoice, outgoing invoice, completion certificate)
- Assigns the partner (if already in the system)
- Fills in the amount fields
- Sets the payment deadline
- If uncertain about any data, flags it for review (with a yellow marker)
The entire process takes less than one minute per document. Previously, this was 5-10 minutes, and up to 15 minutes including verification.
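The filing rules listed above can be sketched in a few lines. In the real system this logic lives in the n8n workflow; the version below is only an illustration. The Airtable field names, the partner lookup by tax ID, the record ID, and the rule "a required field is missing, so flag it" are all assumptions, since the article doesn't describe how uncertainty is actually detected.

```python
# Hypothetical lookup table: issuer tax ID -> partner record ID.
# Both the tax ID and the record ID are made up.
KNOWN_PARTNERS = {"12345678-2-41": "rec001"}

CATEGORIES = {"incoming invoice", "outgoing invoice", "completion certificate"}

# Assumed set of fields that must be present before a record counts as clean.
REQUIRED_FIELDS = ("invoice_number", "gross_amount", "payment_deadline")


def build_record(extracted: dict, category: str) -> dict:
    """Map extracted data to a register row and decide if it needs review."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")

    # Assumption: a field that is absent or empty counts as "uncertain".
    missing = [name for name in REQUIRED_FIELDS if not extracted.get(name)]

    # The partner is assigned only if its tax ID is already in the system.
    partner_id = KNOWN_PARTNERS.get(extracted.get("issuer_tax_id"))

    return {
        "Category": category,
        "Partner": partner_id,  # None when the partner is not known yet
        "Invoice number": extracted.get("invoice_number"),
        "Gross amount": extracted.get("gross_amount"),
        "Payment deadline": extracted.get("payment_deadline"),
        "Needs review": bool(missing),  # drives the yellow marker
        "Review notes": ", ".join(missing),
    }


complete = build_record(
    {
        "issuer_tax_id": "12345678-2-41",
        "invoice_number": "2024-00123",
        "gross_amount": 127000,
        "payment_deadline": "2024-04-13",
    },
    "incoming invoice",
)
incomplete = build_record(
    {"invoice_number": "2024-00124", "gross_amount": 50000},
    "incoming invoice",
)
print(complete["Needs review"], incomplete["Needs review"])
```

Keeping the review flag as an ordinary column means the finance team can filter for flagged rows in Airtable instead of re-checking every document.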
Why We Chose Mistral AI
Several AI OCR solutions are available on the market (Google Document AI, Azure Form Recognizer, OpenAI Vision). For the Murabau project, we chose Mistral's picoVision model, and there were specific reasons for this.
Hungarian language support. The Mistral model performs particularly well with Hungarian-language documents. This isn't a given: many AI models are optimized for English and struggle with Hungarian text. With Mistral, recognition accuracy for Hungarian invoices approaches English-language levels.
Native PDF processing. No need to convert the PDF to an image first. Mistral works directly from the PDF file, which results in faster and more accurate output.
Structured JSON output. The model's response isn't free text but a JSON object with predefined fields. This means the n8n workflow can handle the data immediately, with no additional text processing needed.
Cost efficiency. Processing costs a few forints per document. For 50 documents per month, that's a few hundred forints per month, which is negligible next to the cost of the 8-12 hours of manual data entry it replaces.
AI OCR doesn't just work with invoices. The same technology can be used for:
- Extracting key data from contracts (dates, amounts, parties)
- Processing delivery notes
- Comparing proposals
- Automatically recording completion certificates
- Any structured document where you need to repeatedly extract data
Results in Numbers
| Metric | Before | After |
|---|---|---|
| Processing time / document | 5-10 minutes | less than 1 minute |
| Monthly admin time | 8-12 hours | less than 1 hour |
| Data entry quality | regular typos | 95%+ accuracy |
| Data entry method | manual Excel entry | automatic Airtable filing |
| Cost per document | human work hours | a few forints in AI processing |
Murabau's team now spends most of those 8-12 hours per month on actual work, not administration.
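The figures in the table are easy to sanity-check with some quick arithmetic. The sketch below assumes 50 documents per month, which is the volume used in the cost example earlier and not a reported figure for Murabau. It also assumes that the "before" time per document includes verification (10-15 minutes).

```python
# Assumption: 50 documents per month, taken from the cost example above.
DOCS_PER_MONTH = 50


def monthly_hours(minutes_per_doc: float, docs: int = DOCS_PER_MONTH) -> float:
    """Convert a per-document time in minutes to hours per month."""
    return minutes_per_doc * docs / 60


before_low = monthly_hours(10)   # manual entry at 10 minutes per document
before_high = monthly_hours(15)  # manual entry plus verification
after_max = monthly_hours(1)     # automated, at most 1 minute per document

print(f"before: {before_low:.1f}-{before_high:.1f} hours/month")  # before: 8.3-12.5 hours/month
print(f"after:  under {after_max:.1f} hours/month")               # after:  under 0.8 hours/month
```

Under these assumptions the result lines up with the 8-12 hours before and the less than 1 hour after.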
What Does It Look Like in Practice?
A typical day at Murabau now looks like this: the finance team member opens their email, where 3-4 incoming invoices await. They upload the PDFs to the form one after another, which takes 10-15 seconds per document. By the time they finish with the last one, the first invoice's data is already in Airtable.
If the system considers a piece of data uncertain (for example, with handwritten invoices), it marks it in yellow. The team member reviews these and corrects them if needed. But for the majority of documents, no intervention is required.
What's more, the system learns: the more documents it processes from a given partner, the more accurately it recognizes that particular invoice format.
Who Should Consider This?
The system is worth considering if:
- You manually process more than 10 documents per month
- Data entry is a recurring source of errors in your company
- Your team's time should be spent on more important tasks
- You want documents to be instantly searchable in your records
We build the system so that anyone can use it. No developer knowledge needed, nothing to install — a browser is enough.
Tech Stack
| Tool | Role |
|---|---|
| n8n | Workflow engine, web form, data flow management |
| Mistral AI (picoVision) | Document recognition, data extraction, JSON output |
| Airtable | Data storage, record-keeping, search, filtering |
Related Articles
If you're interested in how automation can help your company, check out these articles too:
- 5 processes every SMB can automate - invoice processing is just one of many possibilities
- When is it worth using AI for a small business? - not every task needs AI, but where it's needed, the savings are significant