Maritime shipping is responsible for transporting nearly 80% of global trade, operating across ocean environments that leave little margin for error. Each voyage must contend with severe weather and congested sea lanes while maintaining compliance with international regulatory frameworks such as the International Regulations for Preventing Collisions at Sea (COLREGs). In this context, even a single navigational misjudgment has the potential to endanger human life and cause large-scale environmental harm.
Generic large language models know these rules but fail to apply them consistently. In collision scenarios, they generate advice that sounds plausible yet violates mandatory protocols. This regulatory consciousness gap makes them unusable where safety matters most.
The AI Alliance has been solving this challenge through its Foundation Models workgroup (FA5), developing open, domain-specific models for industries where trust is non-negotiable. At SemiCon West 2024, Alliance members Aitomatic and Tokyo Electron demonstrated this approach with SemiKong—the first semiconductor-specific large language model. Now, with Llamarine, Aitomatic and Furuno have proven this blueprint works across industries: from silicon fabs to the open sea.
Maritime Expertise Matters
Furuno brought essential domain expertise to Llamarine's development. With nearly 80% of its business in commercial ship navigation, Furuno has shaped how vessels perceive the sea for decades. The company pioneered the Loran long-range navigation system before GPS existed, and today leads in radar, sonar, and fish-finding technologies trusted by fleets worldwide.
This deep maritime heritage was crucial for Llamarine. While Aitomatic provided the AI engineering expertise, Furuno ensured the model was grounded not just in regulations, but in the practical judgment mariners rely on when lives are at stake. Every training decision was validated against real-world bridge operations.
"Furuno has advanced maritime technology for decades—from the world’s first commercially practical fish finder to radar and sonar—always aiming to make the ocean safer and more accessible. With Llamarine, we bring this into OCEAN5.0* era, creating AI that reflects seamanship itself—the expertise, judgment, and human-like knowledge trusted by officers and captains. It supports professionals in complex operations and allows recreational mariners to focus on what truly matters at sea, contributing to the future of the maritime industry."
— Konobu Kimura, Head of Intelligent Processing Technologies Laboratory, FURUNO ELECTRIC CO., LTD.
*OCEAN 5.0: Furuno’s envisioned future society concept, presenting a vision for the near future that aims for "COEXISTENCE AND CO-PROSPERITY WITH THE SEA," where humanity benefits from the ocean while also contributing to its sustainability and addressing social challenges, shifting from dependence to a harmonious balance.
How We Built Llamarine
Llamarine represents a true collaboration between AI innovation and maritime expertise. Aitomatic engineered the training pipeline and technical infrastructure, while Furuno provided the domain knowledge and validation that make the model trustworthy for maritime professionals.
The model builds on Llama 3.1 70B through a sophisticated two-stage training process using 8× A100 80GB GPUs with QLoRA and NF4 quantization—enabling efficient domain adaptation without sacrificing the reasoning depth critical for safety decisions.
Phase 1: Domain Knowledge Injection
Pretrain on maritime-specific data to embed deep knowledge of regulations, vessel operations, and safety protocols directly into the model's weights.
Phase 2: Supervised Fine-Tuning
Teach the model how to execute tasks using instruction-response pairs, so that it decomposes complex maritime decisions into clear, actionable steps. This phase uses our two-stage reasoning decomposition approach, as described in Figure 2.
Figure 1: Question generation pipeline. LLMs synthesized real-world scenarios from domain keywords, references, and sample human queries to produce 56,257 realistic training questions.
The Training Corpus:
- 117 authoritative textbooks covering COLREGs, SOLAS, MARPOL, STCW, and operational practices
- 901 research papers on navigation, autonomy, and optimization
- 56,257 fine-tuning examples spanning maritime concepts (4,852), mathematical reasoning (6,065), and operational challenges (45,340)
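As a quick sanity check on the fine-tuning split listed above, the following minimal sketch sums the three category counts and prints each category's share. The counts come from the list; the variable and function names are illustrative only.

```python
# Fine-tuning example counts as reported in the corpus list above.
FINE_TUNING_COUNTS = {
    "maritime concepts": 4_852,
    "mathematical reasoning": 6_065,
    "operational challenges": 45_340,
}


def category_shares(counts):
    """Return each category's share of the total as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_examples = sum(FINE_TUNING_COUNTS.values())
shares = category_shares(FINE_TUNING_COUNTS)

print(f"total: {total_examples}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The three categories add up to the 56,257 examples quoted above, and operational challenges account for roughly four fifths of the set.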
Our two-stage approach first analyzes questions—synthesized by the workflow described in Figure 1—to identify reasoning paths, and then generates answers based on these structured insights. This approach ensures consistent, reliable guidance that is critical for safety operations.
Figure 2: Answer generation process. Each question was decomposed into reasoning steps and paired with high-quality answers grounded in maritime practice.
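The two-stage idea (analyze first, then answer) can be sketched roughly as follows. This is not the Llamarine pipeline itself: the prompt wording, the function names, and the `stub_llm` stand-in are all assumptions made for illustration, and any callable that maps a prompt string to text could take the stub's place.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in type: any function mapping a prompt to generated text.
LLM = Callable[[str], str]


def analyze_question(llm: LLM, question: str) -> List[str]:
    """Stage 1: ask the model for a reasoning path, one step per line."""
    prompt = (
        "List the reasoning steps needed to answer this maritime question, "
        "one per line.\n"
        f"Question: {question}"
    )
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]


def generate_answer(llm: LLM, question: str, steps: List[str]) -> str:
    """Stage 2: answer the question, conditioned on the stage-1 steps."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))
    prompt = (
        f"Question: {question}\n"
        f"Reasoning steps:\n{numbered}\n"
        "Write the final answer following these steps."
    )
    return llm(prompt)


def two_stage_answer(llm: LLM, question: str) -> Dict[str, object]:
    """Run both stages and keep the intermediate steps for inspection."""
    steps = analyze_question(llm, question)
    return {"question": question, "steps": steps,
            "answer": generate_answer(llm, question, steps)}


def stub_llm(prompt: str) -> str:
    """Canned responses so the sketch runs without a real model."""
    if prompt.startswith("List the reasoning steps"):
        return "Identify the encounter type\nDetermine which vessel gives way\n"
    return "stub answer"


result = two_stage_answer(stub_llm, "Two vessels are crossing. What should I do?")
print(result["steps"])
print(result["answer"])
```

Keeping the intermediate steps in the returned record makes it possible for a reviewer to check the reasoning path separately from the final answer.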
Throughout this process, Furuno’s maritime experts validated outputs against real-world operations, ensuring Llamarine reflected seamanship in practice, not just theory. This industry-in-the-loop approach—pairing Aitomatic's AI capabilities with Furuno’s maritime knowledge—exemplifies how the Alliance brings together complementary expertise.
Technical Implementation Details
For engineers looking to replicate or extend our work:
- Model Architecture: Llama 3.1 70B base with RoPE positional encoding
- Training Configuration:
  - QLoRA + NF4 quantization for memory efficiency
  - Batch size 3 with 3 gradient accumulation steps
  - Learning rate 1.0e-5 with cosine scheduler
  - 0.15 warm-up ratio, trained for 2 epochs
  - 0.05B trainable parameters
- Data Pipeline: PyPDF → GPT-4o cleaning → Tiktoken BPE tokenization
- Deployment: GPTQ post-training quantization for production efficiency
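To make the configuration above more concrete, here is a minimal pure-Python sketch of what a cosine schedule with a 0.15 warm-up ratio looks like, along with two derived quantities. It assumes linear warm-up followed by cosine decay to zero, which is a common shape but is not confirmed by the source. The total step count is hypothetical, and the source does not say whether the batch size of 3 is per device or global.

```python
import math

PEAK_LR = 1.0e-5          # learning rate from the configuration above
WARMUP_RATIO = 0.15       # warm-up ratio from the configuration above
BATCH_SIZE = 3            # from the configuration above
GRAD_ACCUM_STEPS = 3      # from the configuration above
TRAINABLE_PARAMS = 0.05e9 # trainable parameters from the configuration above
BASE_PARAMS = 70e9        # nominal size of the Llama 3.1 70B base model


def learning_rate(step: int, total_steps: int) -> float:
    """Linear warm-up to PEAK_LR, then cosine decay to zero (assumed shape)."""
    warmup_steps = round(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Sequences accumulated before one optimizer update, per batch stream.
sequences_per_update = BATCH_SIZE * GRAD_ACCUM_STEPS

# Share of the base model's parameters that receive gradient updates.
trainable_fraction = TRAINABLE_PARAMS / BASE_PARAMS

TOTAL_STEPS = 1000  # hypothetical; the source gives epochs, not step counts
for step in (0, 150, 575, 1000):
    print(step, learning_rate(step, TOTAL_STEPS))
print(sequences_per_update, f"{trainable_fraction:.4%}")
```

Under these assumptions the rate climbs for the first 15% of steps, peaks at 1.0e-5, and falls to half that value midway through the decay phase. The trainable share works out to well under 0.1% of the base model, which is what makes QLoRA adaptation of a 70B model practical on a single 8-GPU node.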
Evaluation: Where Specialized Models Excel
We evaluated Llamarine against leading commercial and open-source models using a 1,065-question maritime benchmark (400 synthetic questions + 665 from Stack Exchange) covering theory, operations, and calculations. The results demonstrate the power of domain-specialized AI:
Overall Performance Scores:
Figure 3: Comparison of Llamarine with commercial models. Llamarine (green) surpasses GPT-4o, Sonnet 3.5, and GPT-4o-mini across all dimensions, particularly excelling in Practicality and Expert Communication.
The evaluation assessed six critical dimensions: Clarity & Directness (C&D), Practicality & Immediate Usability (PIU), Efficiency & Brevity (E&B), Logical Flow & Coherence (LFC), Expert-to-Expert Communication (EEC), and Use of Examples & Specificity (UES). Llamarine consistently outperformed alternatives across all dimensions, with particularly strong gains in Practicality and Expert Communication—precisely where Furuno’s maritime expertise proved most valuable.
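One simple way to roll per-question ratings up into per-dimension and overall scores is sketched below. The six dimension abbreviations come from the text above; the unweighted averaging, the rating scale, and the numbers are all placeholders chosen for illustration and are not benchmark results.

```python
from statistics import mean

# Dimension abbreviations as defined in the text above.
DIMENSIONS = ("C&D", "PIU", "E&B", "LFC", "EEC", "UES")


def dimension_means(ratings):
    """Average per-question ratings into one score per dimension."""
    return {d: mean(r[d] for r in ratings) for d in DIMENSIONS}


def overall_score(ratings):
    """Unweighted mean of the six dimension means (an assumed aggregation)."""
    return mean(dimension_means(ratings).values())


# Hypothetical placeholder ratings for two questions; NOT real benchmark data.
placeholder_ratings = [
    {"C&D": 8, "PIU": 7, "E&B": 6, "LFC": 8, "EEC": 7, "UES": 6},
    {"C&D": 6, "PIU": 9, "E&B": 8, "LFC": 6, "EEC": 9, "UES": 4},
]

print(dimension_means(placeholder_ratings))
print(overall_score(placeholder_ratings))
```

Reporting the per-dimension means alongside the overall score keeps strengths such as Practicality visible instead of averaging them away.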
Key Findings: Three Critical Differentiators
Our evaluation revealed where domain specialization transforms AI from interesting to indispensable:
1. Compliance as Architecture
While generic models generate plausible but non-compliant guidance, Llamarine consistently embeds regulatory compliance into its reasoning process. As shown in our qualitative comparison (Table 2 in the paper), when asked about collision avoidance, GPT-4o and Sonnet 3.5 primarily restate COLREGs rules, while Llamarine provides specific operational thresholds like CPA (Closest Point of Approach) distances and structured decision trees mariners actually use.
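For readers unfamiliar with CPA, the sketch below shows the standard relative-motion calculation of the closest point of approach and the time to reach it (TCPA). It illustrates the quantity itself, not Llamarine's internals. It assumes a flat local grid in nautical miles, velocities in knots, and vessels holding steady course and speed; the encounter values are hypothetical.

```python
import math


def cpa_tcpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (cpa_distance_nm, tcpa_hours) for two vessels on steady courses.

    Positions are (x, y) in nautical miles on a flat local grid and
    velocities are (vx, vy) in knots.
    """
    # Target position and velocity relative to own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]
    rel_speed_sq = vx * vx + vy * vy
    if rel_speed_sq == 0.0:
        # No relative motion: the range never changes.
        return math.hypot(rx, ry), 0.0
    # Time at which the relative range is smallest.
    tcpa = -(rx * vx + ry * vy) / rel_speed_sq
    # A negative time means the vessels are already opening; use "now".
    tcpa = max(tcpa, 0.0)
    return math.hypot(rx + vx * tcpa, ry + vy * tcpa), tcpa


# Hypothetical encounter: own ship heads north at 10 kn; the target starts
# 5 nm east and 5 nm north of own ship and heads west at 10 kn.
cpa, tcpa = cpa_tcpa((0.0, 0.0), (0.0, 10.0), (5.0, 5.0), (-10.0, 0.0))
print(f"CPA = {cpa:.2f} nm, TCPA = {tcpa * 60:.0f} min")
```

In this example the two tracks meet at the same point after half an hour, so the CPA is zero and avoiding action would be needed well before then.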
2. Deterministic Results for Critical Decisions
Our evaluation demonstrates that Llamarine produces consistent outputs for identical inputs through our two-stage reasoning decomposition—essential for operational trust when lives depend on consistency. Generic models showed variance in their responses, creating dangerous uncertainty in safety-critical situations.
"In our evaluations, Llamarine gave the same safe answer every time. That determinism is exactly what mariners need to trust AI."
— William Nguyen, Llamarine Tech Lead, Senior Applied Scientist, Aitomatic
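A consistency check of this kind can be approximated with a small harness that sends the same prompt several times and compares the outputs. The sketch below is an assumption about how such a check might look, not the evaluation code used for Llamarine; the two stub models are hypothetical stand-ins for real model calls.

```python
import itertools
from collections import Counter
from typing import Callable, Dict


def consistency_report(model: Callable[[str], str], prompt: str,
                       runs: int = 5) -> Dict[str, object]:
    """Call `model` repeatedly on one prompt and summarize output agreement."""
    outputs = Counter(model(prompt) for _ in range(runs))
    most_common_count = outputs.most_common(1)[0][1]
    return {
        "distinct_outputs": len(outputs),
        "agreement": most_common_count / runs,  # share of the modal output
        "deterministic": len(outputs) == 1,
    }


def fixed_model(prompt: str) -> str:
    """Stub that always returns the same text."""
    return "same answer"


def make_varying_model() -> Callable[[str], str]:
    """Build a stub that alternates between two different answers."""
    variants = itertools.cycle(["answer A", "answer B"])
    return lambda prompt: next(variants)


print(consistency_report(fixed_model, "Crossing situation, what now?"))
print(consistency_report(make_varying_model(), "Crossing situation, what now?", runs=4))
```

Exact string matching is the strictest possible criterion; a real evaluation might instead compare the recommended action so that harmless wording differences are not counted as disagreement.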
3. Mastery of Edge Cases Through Operational Training
When facing complex scenarios like fog, heavy traffic, or conflicting sensor signals, Llamarine's training on 45,340 operational problems, validated by Furuno’s maritime experts, produces accurate responses consistent with professional seamanship. Where generic models restate rules, Llamarine translates them into structured, actionable guidance with concrete examples.
Next: From Models to Expert Agents
Ships need more than COLREGs advice: they need real-time routing that handles weather, traffic, and regulations simultaneously. Fabs need more than defect classification: they need predictive maintenance that prevents yield loss.
The AI Alliance's roadmap is clear:
- Train industry-specialized foundation models (SemiKong, Llamarine)
- Enable and build better domain expert agents that connect to real systems and execute decisions
By combining industry-specific models with agentic frameworks like Dana, Alliance members are building systems that actively leverage domain expertise to manage routes, monitor compliance, and execute complex workflows with the reliability professionals demand.
"Dana is the first open-source agentic OS that unifies natural language, symbolic reasoning, and structured execution in an open-source language purpose-built for agents. It provides a powerful platform for experimenting with adaptive, composable, and explainable AI systems while contributing to a growing global standard."
— Christopher Nguyen, CEO & Co-Founder of Aitomatic and Dana creator
This blueprint applies to any industry where expertise matters and mistakes carry consequences—healthcare, aviation, energy, and manufacturing. The Alliance brings domain experts and AI teams together to build these systems openly.
Join us.
Explore and Build Today
- Read the research paper: arXiv:2503.00203
- Try Llamarine on Hugging Face: aitomatic/Llamarine-70B
- Build Domain Expert Agents with Dana: github.com/aitomatic/dana
The AI Alliance Foundation Models workgroup (FA5) develops open, domain-specific AI for safety-critical industries. Learn more at thealliance.ai.
Llamarine was developed through collaboration between Aitomatic (technical lead and infrastructure) and Furuno (maritime domain expertise) as part of the AI Alliance's Foundation Models workgroup.