<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Generative AI Archives - OpenZeka EN Blog</title>
	<atom:link href="https://blog-en.openzeka.com/category/generative-ai/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog-en.openzeka.com/category/generative-ai/</link>
	<description>NVIDIA Jetson Developer Kits &#38; Edge Devices</description>
	<lastBuildDate>Fri, 27 Mar 2026 13:44:56 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>NVIDIA DGX Spark vs NVIDIA Jetson Thor</title>
		<link>https://blog-en.openzeka.com/nvidia-dgx-spark-vs-nvidia-jetson-thor/</link>
		
		<dc:creator><![CDATA[Betül Kaya]]></dc:creator>
		<pubDate>Tue, 23 Dec 2025 13:58:02 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Performance]]></category>
		<guid isPermaLink="false">https://blog-en.openzeka.com/?p=1504</guid>

					<description><![CDATA[<p>One of the most common mistakes made when developing a ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/nvidia-dgx-spark-vs-nvidia-jetson-thor/">NVIDIA DGX Spark vs NVIDIA Jetson Thor</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-padding-right:0px;--awb-padding-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-text fusion-text-1"><p>One of the most common mistakes made when developing artificial intelligence systems is evaluating hardware designed for different purposes as if they were meant to solve the same problem. Although NVIDIA DGX Spark and NVIDIA Jetson Thor—two of NVIDIA’s recently prominent products—are often compared due to their similar names and emphasis on high performance, they are in fact two entirely different platforms designed to solve completely different problems.</p>
</div><div class="fusion-text fusion-text-2"><p>The purpose of this article is to clearly highlight the differences between DGX Spark and Jetson Thor and to make the following distinction explicit at the end:</p>
<p><strong>DGX Spark</strong> is designed for developing, training, and testing artificial intelligence models.<br />
<strong>Jetson Thor</strong>, on the other hand, is designed to run these models in the real world, on robots and physical systems.</p>
</div><div class="fusion-title title fusion-title-1 fusion-sep-none fusion-title-text fusion-title-size-two"><h2 class="fusion-title-heading title-heading-left" style="margin:0;">What is NVIDIA DGX Spark?</h2></div><div class="fusion-text fusion-text-3"><p>NVIDIA DGX Spark is a compact AI supercomputer positioned in a desktop form factor, designed to enable the development and execution of artificial intelligence models entirely in a local environment. At the heart of the system is the Grace Blackwell GB10 Superchip, which combines NVIDIA’s Grace CPU and Blackwell GPU architectures into a single chip. Thanks to this architectural integration, DGX Spark delivers up to 1 petaflop of AI computing performance along with 128 GB of coherent unified LPDDR5x memory. This makes it an extremely powerful local development platform for large language models and generative AI workloads.</p>
<p>Each DGX Spark can operate independently as a fully capable AI workstation. When two Spark devices are connected together, the system reaches a unified memory capacity of 256 GB, transforming into an expanded AI node capable of handling models with up to 405 billion parameters. While pairing a maximum of two units is currently supported, NVIDIA states that this limit may be increased in the future through software updates.</p>
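<p>A back-of-the-envelope estimate makes these capacity figures concrete. The sketch below (weights only, ignoring KV cache and runtime overhead; the function name is our own) shows why a 405-billion-parameter model is quoted for the paired 256 GB configuration rather than a single unit:</p>

```python
def model_weight_gb(params_billion, bits_per_weight):
    """Approximate weight memory in GB: parameters x bits / 8.
    Weights only -- KV cache, activations, and runtime overhead are ignored."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A 405B-parameter model at common precisions:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{model_weight_gb(405, bits):.0f} GB")
```

<p>Even at 4-bit precision the weights alone come to roughly 202 GB, beyond a single unit's 128 GB but comfortably inside the paired 256 GB.</p>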
<p>DGX Spark aims to reduce reliance on the cloud or data centers by enabling the following workloads to be performed entirely in a local environment.</p>
</div><div class="fusion-title title fusion-title-2 fusion-sep-none fusion-title-text fusion-title-size-two"><h2 class="fusion-title-heading title-heading-left" style="margin:0;">DGX Spark Use Cases</h2></div><div class="fusion-text fusion-text-4"><p><strong>Fine-Tuning</strong><br />
DGX Spark provides a powerful fine-tuning platform, especially for organizations working with enterprise, sensitive, or regulated data. In sectors such as finance, healthcare, defense, or law, large language models, image recognition systems, or task-specific AI models can be fine-tuned entirely locally without data leaving the organization. This approach supports compliance with regulations such as GDPR and reduces intellectual-property risk.</p>
<p><strong>Inference and Local AI Services</strong><br />
DGX Spark enables low-latency, high-efficiency inference of trained models in desktop or local server environments. Chatbots, document analysis systems, visual inspection applications, or decision support systems can run in real time without relying on the cloud. As a result, performance improves while network dependency and data transfer risks are eliminated.</p>
<p><strong>Data Science and Analytics Workloads</strong><br />
For data scientists working with large datasets, DGX Spark consolidates data cleaning, model training, and evaluation steps into a single powerful platform. Thanks to GPU-accelerated computing, complex statistical analyses, simulations, and machine learning pipelines can be completed much faster. This provides a significant speed advantage, especially for Proof of Concept (PoC) and pilot projects.</p>
<p><strong>Transition from Cloud to Desktop and Desktop to Cloud</strong><br />
DGX Spark is designed to be fully compatible with the NVIDIA ecosystem. After developing and testing a model on DGX Spark, you can move it to DGX Cloud or other accelerated cloud infrastructures using the same codebase and software stack with little to no modification. This approach offers great flexibility for organizations adopting hybrid AI strategies.</p>
<p><strong>Working with Secure and Sensitive Data</strong><br />
DGX Spark is an ideal solution for scenarios where data must remain within the organization. Sensitive customer data, internal company documents, or confidential R&amp;D outputs can be processed and modeled locally without being uploaded to the cloud. This reduces cybersecurity risks and simplifies regulatory compliance.</p>
<p><strong>Education, Academic, and Enterprise AI Laboratories</strong><br />
For universities, research centers, and corporate AI teams, DGX Spark functions as a compact yet extremely powerful “AI laboratory.” Students and engineers can gain hands-on experience working with large-scale models on real hardware and develop scenarios that are much closer to production environments.</p>
</div><div class="fusion-title title fusion-title-3 fusion-sep-none fusion-title-text fusion-title-size-two"><h2 class="fusion-title-heading title-heading-left" style="margin:0;">What is NVIDIA Jetson Thor?</h2></div><div class="fusion-text fusion-text-5"><p>NVIDIA Jetson Thor is a high-performance edge AI platform developed for Physical AI, robotics, and autonomous systems. The core objective of Jetson Thor is to run large language models (LLMs), vision-language models (VLMs), and vision-language-action (VLA) models in real time with low latency and high energy efficiency. In this respect, Thor is positioned as the central “brain” of a robot or autonomous system, responsible for decision-making and action execution.<br />
Thanks to its Blackwell-based architecture, Jetson Thor delivers up to 2,070 TFLOPS (FP4 – sparsity-enabled) of AI computing performance, making it possible to deploy advanced models developed at data-center scale directly in edge environments. The Jetson Thor module family is optimized for Physical AI and robotics applications, combining high performance with a flexible power profile: configurable power consumption between 40 W and 130 W, along with up to 128 GB of memory.</p>
<p>This powerful hardware foundation allows LLM, VLM, and VLA models to run concurrently in a deterministic, low-latency manner. Its high energy efficiency makes Jetson Thor an ideal solution for 24/7 autonomous systems, robotic platforms, and mission-critical edge AI applications.</p>
</div><div class="fusion-text fusion-text-6"><p>The platform is optimized to process multiple data streams simultaneously from cameras, LiDAR, radar, and other sensors, enabling the entire perception–decision–action loop to be closed fully at the edge. Jetson Thor’s architecture targets continuously operating, time-sensitive systems that interact with the real world, rather than desktop- or data-center-oriented development environments.</p>
<p>In short, Jetson Thor is not a platform for developing AI models; it is an edge AI solution designed to run already developed models in the field, in the physical world, and in real time. Especially in robotics, autonomous vehicles, and Physical AI scenarios, it serves as a foundational building block for modern autonomous systems by unifying high computational power, low latency, sensor integration, and energy efficiency in a single platform.</p>
</div><div class="fusion-text fusion-text-7"><p>Jetson Thor’s high computational performance and extensive I/O capabilities make it an ideal solution across a wide range of industries. Below are some of the potential application areas of Jetson Thor:</p>
<ul>
<li><strong>Autonomous Systems (Vehicles and Robots)</strong><br />
By processing LiDAR, camera, and radar data simultaneously, Jetson Thor enables autonomous vehicles to perceive their environment and make safe decisions. Humanoid robots and unmanned aerial vehicles (UAVs) can also perform tasks such as real-time localization, mapping (SLAM), and obstacle detection more efficiently with Jetson Thor.</li>
<li><strong>Smart Cities and Public Safety</strong><br />
Jetson Thor can analyze 24/7 video streams from city surveillance cameras locally, without relying on the cloud. This enables instant traffic management, crowd monitoring, and detection of security threats. Thanks to its high memory capacity, Jetson Thor can analyze 4K/8K video streams in real time for smart city applications.</li>
<li><strong>Industrial Automation</strong><br />
When integrated into robotic arms or camera systems on production lines, Jetson Thor enables AI-driven tasks such as defect detection, quality control, and predictive maintenance to be performed in real time. Its rugged design and long-lifecycle industrial variants ensure reliable operation in harsh industrial environments.</li>
<li><strong>Healthcare Technologies</strong><br />
Medical devices and innovative healthcare systems can also benefit from Jetson Thor’s capabilities. For example, a portable MRI or ultrasound device can process images locally using AI to deliver instant diagnostic insights. When equipped with Jetson Thor, surgical robots can perform real-time image processing and precise control during operations. In addition, patient monitoring systems can process data locally while preserving privacy.</li>
<li><strong>Security and Surveillance</strong><br />
Smart security cameras can perform deep learning–based tasks such as facial recognition or threat detection in real time using Jetson Thor. This enhances security while reducing network traffic in environments such as banks, airports, and critical infrastructure. The system can detect suspicious situations on-site and send immediate alerts to security personnel.</li>
</ul>
</div>
<div class="table-1">
<table width="100%">
<thead>
<tr>
<th align="left">Feature</th>
<th align="left">NVIDIA DGX Spark</th>
<th align="left">NVIDIA Jetson Thor</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Primary Purpose</td>
<td align="left">AI development, training, testing</td>
<td align="left">Robotics and Physical AI Inference</td>
</tr>
<tr>
<td align="left">Deployment Environment</td>
<td align="left">Desktop / Office / Lab</td>
<td align="left">Edge / Robot / Autonomous systems</td>
</tr>
<tr>
<td align="left">LLM Prefill Performance</td>
<td align="left">Very high (compute-bound)</td>
<td align="left">Optimized for edge</td>
</tr>
<tr>
<td align="left">Power Consumption</td>
<td align="left">High</td>
<td align="left">Low and energy-efficient</td>
</tr>
<tr>
<td align="left">Real-Time Operation</td>
<td align="left">Not a priority</td>
<td align="left">Critical requirement</td>
</tr>
<tr>
<td align="left">Sensor Integration</td>
<td align="left">None</td>
<td align="left">Camera, LIDAR, radar etc.</td>
</tr>
<tr>
<td align="left">Target User</td>
<td align="left">AI developers, data scientists</td>
<td align="left">Robotics and embedded systems developers</td>
</tr>
</tbody>
</table>
</div>
<div class="fusion-text fusion-text-8" style="--awb-margin-top:15px;"><p>If your goal is Physical AI, robotics, autonomous driving, and edge inference:</p>
<ul>
<li>Jetson Thor is specifically designed for this purpose and is the right choice.<br />
If you need AI model development, training, testing, fine-tuning, and high-performance local computation.</li>
<li>DGX Spark is purpose-built exactly for these needs.</li>
</ul>
<p>For large-scale organizations, these two products are not competitors but complementary: You develop the model on DGX Spark and deploy it into the real world on Jetson Thor.</p>
</div></div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/nvidia-dgx-spark-vs-nvidia-jetson-thor/">NVIDIA DGX Spark vs NVIDIA Jetson Thor</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>HammerBench: AGX Thor’s Power Meets Ollama</title>
		<link>https://blog-en.openzeka.com/hammerbench-agx-thors-power-meets-ollama/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Wed, 17 Sep 2025 13:28:53 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog-en.openzeka.com/?p=1372</guid>

					<description><![CDATA[<p>What is an LLM benchmark and why is it important?  LLM  ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/hammerbench-agx-thors-power-meets-ollama/">HammerBench: AGX Thor’s Power Meets Ollama</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-2 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-1 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-title title fusion-title-4 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">What is an LLM benchmark and why is it important?</h3></div><div class="fusion-text fusion-text-9"><p><strong>LLM benchmarks</strong> are standardized tests designed to measure how fast, efficient, and accurate large language models (LLMs) perform across different hardware and environments. These tests evaluate metrics such as latency, throughput, and sometimes accuracy to provide an objective view of performance.</p>
<p>As LLMs continue to grow larger and more complex, choosing the right hardware to run them on becomes a critical decision. Benchmark results are essential to understand which device or infrastructure delivers better performance, to balance cost and efficiency, and to identify the most suitable solution for real-world use cases. In short, LLM benchmarks give both researchers and developers a clear roadmap of how models perform in practice.</p>
<p>To showcase the performance of <strong>Jetson AGX Thor</strong>, we are sharing our results and performance charts with you. At the same time, you can also run benchmarks across<strong> different GPU types</strong> to compare and validate performance for your own workloads. If you want to measure the performance metrics of your own devices and test your models under real-world conditions, get in touch with us. With our solution, your measurements turn into more than just numbers — they become actionable insights that drive strategic decisions.</p>
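<p>To make the metrics concrete, here is a minimal sketch of how latency and throughput are typically derived from a single generation call. The function name is our own and the model is a stand-in; this is not HammerBench's actual implementation.</p>

```python
import time

def benchmark_generation(generate_fn, prompt):
    """Time one generation call and derive the two core metrics:
    latency (seconds to finish) and throughput (tokens per second).
    `generate_fn` is any callable that returns a list of tokens."""
    start = time.perf_counter()
    tokens = generate_fn(prompt)
    latency = time.perf_counter() - start
    return {
        "latency_s": latency,
        "tokens": len(tokens),
        "tokens_per_s": len(tokens) / latency if latency > 0 else 0.0,
    }

# Stand-in "model" that pretends to emit 64 tokens:
print(benchmark_generation(lambda prompt: ["tok"] * 64, "Hello"))
```

<p>Real benchmark runs repeat this over many prompts and report aggregates (mean, p95, and so on), but the core arithmetic is the same.</p>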
</div><div class="fusion-title title fusion-title-5 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">How to use HammerBench?</h3></div><div class="fusion-text fusion-text-10"><p><strong>🖥️ What the App Does</strong></p>
<p>This is a Streamlit-based LLM Benchmark Tool interface designed to evaluate large language models (LLMs) on NVIDIA Jetson AGX Thor hardware using Ollama as the backend.</p>
<p>⚙️ <strong>Configuration (Left Sidebar)</strong></p>
<ul>
<li>GPU Information:
<ul>
<li>Detects if the device is a Jetson (in this case, a Jetson AGX Thor Developer Kit).</li>
<li>Shows details about the GPU (NVIDIA Jetson AGX Thor) and available memory (125,772 MB ≈ 122.8 GB).</li>
</ul>
</li>
</ul>
<p><strong>Use Only GPU:</strong></p>
<p>A checkbox option that allows restricting benchmarks to GPU-only execution.</p>
<p><strong>📊 Main Panel</strong></p>
<p><strong>Title:</strong> <em>LLM Benchmark Tool</em>, with the description: <em>Benchmark LLM models using Ollama with real-time progress tracking.</em></p>
<p><strong>Models Compatible with GPU memory (VRAM) requirements:</strong></p>
<ul>
<li>Displays a table of available models (llama3.2:1b, gemma3:4b, qwen3:14b, gpt-oss:20b, etc.)</li>
<li>Shows how much memory (VRAM in GB) each model requires.</li>
<li>Marks them with ✅ if they are runnable on the detected GPU.</li>
</ul>
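<p>The compatibility check behind those ✅ marks boils down to comparing each model's memory requirement with the detected free memory. A minimal sketch of that logic (the 27B and 120B figures appear in this post; the other per-model numbers are illustrative placeholders, not HammerBench's actual table):</p>

```python
AVAILABLE_GB = 122.8  # free memory reported above for the Jetson AGX Thor

# VRAM needed per model, in GB. The 17 and 65 GB figures come from this
# post; the rest are illustrative placeholders.
MODEL_VRAM_GB = {
    "llama3.2:1b": 1.3,
    "gemma3:4b": 3.3,
    "qwen3:14b": 9.3,
    "gemma3:27b": 17.0,
    "gpt-oss:120b": 65.0,
}

def runnable(model, available_gb=AVAILABLE_GB):
    """True if the model's weights fit in the available GPU memory."""
    return MODEL_VRAM_GB[model] <= available_gb

for name, need in MODEL_VRAM_GB.items():
    print(f"{'✅' if runnable(name) else '❌'} {name} ({need} GB)")
```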
<p><strong>Select Models to Benchmark:</strong></p>
<ul>
<li>Lists the same models with checkboxes so the user can pick which ones to run benchmarks on.</li>
<li>Each option shows the memory requirement for clarity (e.g., gemma3:27b (17 GB), gpt-oss:120b (65 GB)).</li>
</ul>
<p><strong>🚀 Purpose</strong></p>
<p>The tool helps developers and researchers:</p>
<ul>
<li>See which LLMs are compatible with their GPU memory.</li>
<li>Select multiple models and run benchmarks to measure performance (latency, throughput, GPU utilization).</li>
<li>Use the results to compare models and make better deployment or scaling decisions.</li>
</ul>
</div><div class="fusion-video fusion-selfhosted-video" style="max-width:100%;"><div class="video-wrapper"><video playsinline="true" width="100%" style="object-fit: cover;" autoplay="true" muted="true" loop="true" preload="auto" controls="1"><source src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/animation.webm" type="video/webm">Sorry, your browser doesn&#039;t support embedded videos.</video></div></div></div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/hammerbench-agx-thors-power-meets-ollama/">HammerBench: AGX Thor’s Power Meets Ollama</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		<enclosure url="https://blog-en.openzeka.com/wp-content/uploads/2025/09/animation.webm" length="194501" type="video/webm" />

			</item>
		<item>
		<title>How to Run Llama.cpp Server on Jetson AGX Thor?</title>
		<link>https://blog-en.openzeka.com/how-to-run-llama-cpp-server-on-jetson-agx-thor/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Fri, 12 Sep 2025 10:44:53 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog-en.openzeka.com/?p=1410</guid>

					<description><![CDATA[<p>Llama.cpp Server on Jetson AGX Thor: Unlocking Edge AI  ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-llama-cpp-server-on-jetson-agx-thor/">How to Run Llama.cpp Server on Jetson AGX Thor?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-3 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-2 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-blend:overlay;--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:0px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-title title fusion-title-6 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;">Llama.cpp Server on Jetson AGX Thor: Unlocking Edge AI with Large Language Models</h4></div><div class="fusion-text fusion-text-11"><p><strong>Llama.cpp Server</strong> is a lightweight, high-performance runtime for large language models (LLMs), designed to run efficiently on both CPU and GPU. Built in C++, it eliminates unnecessary overhead and delivers deep hardware-level optimizations. By supporting the <strong>GGUF model format,</strong> it allows for quantization, drastically reducing memory requirements while maintaining accuracy. 
Through its<strong> REST API,</strong> Llama.cpp Server can be seamlessly integrated into applications, enabling developers to bring advanced LLM capabilities directly to devices—without relying on the cloud.</p>
<p>When deployed on <strong>NVIDIA Jetson AGX Thor</strong>, the advantages become even more compelling:</p>
<ul>
<li>GPU acceleration with<strong> CUDA</strong> ensures that the Thor’s compute power is fully utilized, bringing real-time inference to the edge.</li>
<li>Optimized for edge AI use cases such as robotics, autonomous systems, and industrial automation, it provides ultra-low latency decision-making.</li>
<li>Resource efficiency via quantization makes it possible to run models from 7B up to 13B parameters within the limited memory budgets typical of embedded devices.</li>
</ul>
<p>By combining <strong>Llama.cpp Server</strong> with Jetson<strong> AGX Thor</strong>, organizations gain a powerful platform for on-device AI that is private, fast, and cost-effective. No data needs to leave the device, latency is minimized, and the system remains fully adaptable to both prototyping and production scenarios. Supported by an open-source ecosystem, this pairing represents a breakthrough for deploying large language models securely and efficiently at the edge.</p>
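<p>As a concrete illustration of the REST integration described above, the sketch below posts a prompt to a running server from Python using only the standard library. It assumes the server is listening on localhost port 8080 and uses llama.cpp server's /completion endpoint; the helper names are our own.</p>

```python
import json
import urllib.request

def build_completion_request(prompt, n_predict=128):
    """JSON body for llama.cpp server's /completion endpoint."""
    return {"prompt": prompt, "n_predict": n_predict}

def ask_llama(prompt, host="http://localhost:8080", n_predict=128):
    """POST a prompt to a running llama.cpp server and return the
    generated text from the JSON response's `content` field."""
    data = json.dumps(build_completion_request(prompt, n_predict)).encode()
    req = urllib.request.Request(
        f"{host}/completion",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

# Requires a running server:
# print(ask_llama("Explain edge AI in one sentence."))
```

<p>Because the interface is plain HTTP plus JSON, the same call works from any language or service on the network, which is what makes on-device integration straightforward.</p>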
</div></div></div><div class="fusion-layout-column fusion_builder_column fusion-builder-column-3 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-title title fusion-title-7 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Requirements</h3></div><div class="fusion-text fusion-text-12"><ul>
<li>JetPack 7 (<span style="color: #76b900;"><a style="color: #76b900;" href="https://blog-en.openzeka.com/what-is-nvidia-jetpack-beginner-friendly-guide/">Learn more about JetPack</a></span>)</li>
<li>CUDA 13</li>
<li>At least 10 GB of free disk space<strong> (Only for the Llama Server image, not for the models.)</strong></li>
<li>A stable and fast internet connection</li>
</ul>
</div><div class="fusion-title title fusion-title-8 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;">How to use Llama.cpp Server?</h4></div><div class="fusion-text fusion-text-13"><p>First, download the image:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-1 > .CodeMirror, .fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-1 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_1" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_1" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_1" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run --gpus all -it --rm \
  -p 8080:8080 \
  -v /workspace/models:/models \
  ghcr.io/nvidia-ai-iot/llama_cpp:r38.2.arm64-sbsa-cu130-24.04 \
  /bin/bash</textarea></div><div class="fusion-text fusion-text-14" style="--awb-margin-top:20px;"><p>Then, download the model from Hugging Face. If the model requires access, log in with your token by running:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-2 > .CodeMirror, .fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-2 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_2" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_2" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_2" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># huggingface-cli login
hf download Qwen/Qwen3-4B-Instruct-2507</textarea></div><div class="fusion-text fusion-text-15" style="--awb-margin-top:20px;"><p>Then, install the required Python dependencies with the following command:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-3 > .CodeMirror, .fusion-syntax-highlighter-3 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-3 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_3" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_3" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_3" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">pip install transformers torch mistral_common sentencepiece</textarea></div><div class="fusion-text fusion-text-16" style="--awb-margin-top:20px;"><p>This command set downloads the <strong>NVIDIA NVPL local repository package</strong>, installs it, adds the signing key to the system, and then installs the NVPL library via apt-get.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-4 > .CodeMirror, .fusion-syntax-highlighter-4 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-4 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_4" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_4" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_4" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">export NVPL_VERSION=25.5
export DISTRO=ubuntu2404

wget https://developer.download.nvidia.com/compute/nvpl/${NVPL_VERSION}/local_installers/nvpl-local-repo-${DISTRO}-${NVPL_VERSION}_1.0-1_arm64.deb

dpkg -i nvpl-local-repo-ubuntu2404-25.5_1.0-1_arm64.deb

cp /var/nvpl-local-repo-ubuntu2404-25.5/nvpl-local-52E38D21-keyring.gpg /usr/share/keyrings/

apt-get update && apt-get install -y nvpl</textarea></div><div class="fusion-text fusion-text-17" style="--awb-margin-top:20px;"><p>This command takes the Qwen2.5-VL-3B-Instruct model downloaded from Hugging Face (inside the snapshot folder identified by its hash), and uses the convert_hf_to_gguf.py tool to convert the Hugging Face weights (safetensors/PyTorch) into GGUF format, saving the output as /data/models/Qwen3-4B-Instruct-2507-f16.gguf.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-5 > .CodeMirror, .fusion-syntax-highlighter-5 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-5 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_5" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_5" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_5" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">python3 /opt/llama_cpp_python/vendor/llama.cpp/convert_hf_to_gguf.py \
  /data/models/huggingface/models--Qwen--Qwen2.5-VL-3B-Instruct/snapshots/<hash> \
  --outfile /data/models/Qwen3-4B-Instruct-2507-f16.gguf</textarea></div><div class="fusion-text fusion-text-18" style="--awb-margin-top:20px;"><p>This command takes the full-precision GGUF model (Qwen3-4B-Instruct-2507-f16.gguf) and runs it through llama-quantize to produce a quantized version <strong>(Qwen3-4B-Instruct-2507-q4_k_m.gguf)</strong> using the<strong> q4_k_m quantization method.</strong></p>
<ul>
<li><strong>Input file:</strong> /data/models/Qwen3-4B-Instruct-2507-f16.gguf (the FP16 model converted from Hugging Face).</li>
<li><strong>Output file:</strong> /data/models/Qwen3-4B-Instruct-2507-q4_k_m.gguf (smaller, quantized model).</li>
<li><strong>Quantization type:</strong> q4_k_m → a 4-bit quantization scheme optimized for speed and memory efficiency.</li>
</ul>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-6 > .CodeMirror, .fusion-syntax-highlighter-6 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-6 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_6" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_6" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_6" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">llama-quantize /data/models/Qwen3-4B-Instruct-2507-f16.gguf \
  /data/models/Qwen3-4B-Instruct-2507-q4_k_m.gguf q4_k_m</textarea></div><div class="fusion-text fusion-text-19" style="--awb-margin-top:20px;"><p>This command launches the llama.cpp server so the quantized model can be served via an<strong> HTTP API.</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-7 > .CodeMirror, .fusion-syntax-highlighter-7 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-7 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_7" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_7" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_7" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">llama-server \
  -m /data/models/Qwen3-4B-Instruct-2507-q4_k_m.gguf \
  --host 0.0.0.0 --port 8080 \
  -c 8192 \
  --n-gpu-layers 35</textarea></div><div class="fusion-text fusion-text-20" style="--awb-margin-top:20px;"><p>And that’s it! You can start chatting.</p>
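Once the server is up, you can also talk to it from Python. The sketch below assumes the server started by the command above is reachable at http://localhost:8080 and uses llama.cpp&#8217;s OpenAI-compatible /v1/chat/completions route; the helper names (build_chat_request, ask) and the model string are illustrative, not part of llama.cpp:

```python
import json
import urllib.request

def build_chat_request(prompt, model="Qwen3-4B-Instruct-2507-q4_k_m",
                       max_tokens=256, temperature=0.7):
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": False,
    }

def ask(prompt, url="http://localhost:8080/v1/chat/completions"):
    """POST the request to the running llama-server and return the reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (requires llama-server to be running):
#   reply = ask("Hello!")
```

Only the Python standard library is needed; call ask("Hello!") while llama-server is running.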
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-1 hover-type-zoomin"><img fetchpriority="high" decoding="async" width="1024" height="568" title="Screenshot from 2025-09-12 13-31-53" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-12-13-31-53-1024x568.png" alt class="img-responsive wp-image-1422" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-12-13-31-53-200x111.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-12-13-31-53-400x222.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-12-13-31-53-600x333.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-12-13-31-53-800x444.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-12-13-31-53-1200x666.png 1200w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div></div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-llama-cpp-server-on-jetson-agx-thor/">How to Run Llama.cpp Server on Jetson AGX Thor?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Run MLC LLM on Jetson AGX Thor?</title>
		<link>https://blog-en.openzeka.com/how-to-run-mlc-llm-on-jetson-agx-thor/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Tue, 09 Sep 2025 10:21:01 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog.aetherix.com/?p=1322</guid>

					<description><![CDATA[<p>What is MLC LLM ? MLC LLM (Machine Learning Compilation ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-mlc-llm-on-jetson-agx-thor/">How to Run MLC LLM on Jetson AGX Thor?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-4 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-padding-right:0px;--awb-padding-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-4 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-title title fusion-title-9 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">What is MLC LLM ?</h3></div><div class="fusion-text fusion-text-21"><p><strong>MLC LLM (Machine Learning Compilation for Large Language Models)</strong> is an open-source project designed to make large language models (LLMs) run efficiently across different hardware platforms. Its main goal is to optimize performance and reduce energy consumption, enabling AI applications to run not only in the cloud but also on edge devices.</p>
<p>NVIDIA’s next-generation <strong>Jetson AGX Thor platform</strong> delivers powerful computing capabilities for robotics, autonomous systems, and AI-driven applications. By leveraging <strong>MLC LLM</strong> on <strong>Jetson AGX Thor</strong>, large language models can be optimized to run in real time, supporting tasks such as natural language processing, decision-making, and human-like interaction with higher efficiency.</p>
</div></div></div><div class="fusion-layout-column fusion_builder_column fusion-builder-column-5 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-text fusion-text-22"><p>In short, <strong>MLC LLM</strong> on <strong>Jetson AGX Thor</strong> acts as a bridge that brings high-performance large language model capabilities to edge devices.</p>
</div><div class="fusion-title title fusion-title-10 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Requirements</h3></div><div class="fusion-text fusion-text-23"><ul>
<li>JetPack 7 (<span style="color: #76b900;"><a style="color: #76b900;" href="https://blog-en.openzeka.com/what-is-nvidia-jetpack-beginner-friendly-guide/">Learn more about JetPack</a></span>)</li>
<li>CUDA 13</li>
<li>At least 25 GB of free disk space<strong> (Only for the MLC LLM image, not for the models.)</strong></li>
<li>A stable and fast internet connection</li>
</ul>
</div><div class="fusion-title title fusion-title-11 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">How to use <i>MLC</i> LLM?</h3></div><div class="fusion-text fusion-text-24"><p>First, pull and run the <i>Docker</i> image on your Jetson:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-8 > .CodeMirror, .fusion-syntax-highlighter-8 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-8 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_8" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_8" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_8" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">sudo docker run -it --rm \
  --runtime nvidia \
  --gpus all \
  -v /workspace:/workspace \
  -p 6678:6678 \
  -p 6677:6677 \
  ghcr.io/nvidia-ai-iot/mlc:r38.2.arm64-sbsa-cu130-24.04 </textarea></div><div class="fusion-text fusion-text-25" style="--awb-margin-top:20px;"><p>If you’d like to explore the available images or replace them with newer ones, you can visit the <strong><a style="color: #14ce00;" href="http://ghcr.io/nvidia-ai-iot/mlc">GitHub Container Registry.</a></strong></p>
</div><div class="fusion-text fusion-text-26"><p>Once inside the container, find the model you want to download from Hugging Face.<br />
Then use the hf download command to download the model.</p>
<p><strong>For example:</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-9 > .CodeMirror, .fusion-syntax-highlighter-9 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-9 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_9" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_9" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_9" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">hf download Qwen/Qwen3-30B-A3B-Instruct-2507</textarea></div><div class="fusion-text fusion-text-27" style="--awb-margin-top:20px;"><p>In the next step, provide the folder where you downloaded the model and run the command below.<br />
This command converts the model’s original Hugging Face weights (in safetensors format) into the optimized<strong> MLC LLM format.</strong> During conversion, the weights are quantized (e.g., to <strong>q4bf16_1</strong>), which reduces memory usage and improves runtime efficiency on the GPU without heavily sacrificing accuracy.</p>
<p>In short,<strong> mlc_llm convert_weight</strong> takes the raw model checkpoint and transforms it into a format that can be directly executed by the MLC runtime on your target device (e.g., Jetson AGX Thor with CUDA).</p>
<p><em><strong>⚠️ Warning:</strong> In the command, replace &lt;hash&gt; in <strong>snapshots/&lt;hash&gt;/</strong> with the actual folder name you see inside the snapshots directory<strong> (e.g., aeb13307a71acd8fe81861d94ad54ab689df&#8230;)</strong>. This folder contains the real model files such as config.json, tokenizer.json, and model.safetensors, which are required for the <strong>mlc_llm convert_weight</strong> command to work.</em></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-10 > .CodeMirror, .fusion-syntax-highlighter-10 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-10 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_10" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_10" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_10" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">mlc_llm convert_weight /data/models/huggingface/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/<hash>/ \
    --quantization q4bf16_1 \
    --model-type qwen3 \
    --device cuda \
    --source-format huggingface-safetensor \
    -o /workspace/models/mlc/Qwen3-30B-A3B-Instruct-2507-q4bf16_1</textarea></div><div class="fusion-text fusion-text-28" style="--awb-margin-top:20px;"><p>In the next step, <strong>gen_config</strong> generates the configuration files needed to run the converted model in <strong>MLC</strong>. It defines the conversation template (<strong>e.g., Qwen format</strong>), context length, batch size, and other runtime parameters. In short, it makes the weight-converted model fully executable in the MLC runtime.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-11 > .CodeMirror, .fusion-syntax-highlighter-11 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-11 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_11" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_11" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_11" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">mlc_llm gen_config \
    /data/models/huggingface/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/<hash>/config.json \
    --quantization q4bf16_1 \
    --conv-template qwen2 \
    --context-window-size 32768 \
    --prefill-chunk-size 4096 \
    --max-batch-size 3 \
    --output /workspace/models/mlc/Qwen3-30B-A3B-Instruct-2507-q4bf16_1
</textarea></div><div class="fusion-text fusion-text-29" style="--awb-margin-top:20px;"><p><em><strong>⚠️ Note:</strong> The “Not found” messages for files like tokenizer.model or added_tokens.json are not errors. These files are optional and not required by all models. As long as <strong>tokenizer.json, vocab.json, and merges.txt</strong> are found and copied, the model configuration is complete and ready to run.</em></p>
<p>Now that the configuration is ready, we can move on to the compilation step. In this stage, the model is compiled into a CUDA-optimized shared library (.so file), which enables fast execution on the GPU.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-12 > .CodeMirror, .fusion-syntax-highlighter-12 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-12 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_12" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_12" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_12" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">mlc_llm compile \
    /workspace/models/mlc/Qwen3-30B-A3B-Instruct-2507-q4bf16_1/mlc-chat-config.json \
    --device cuda \
    -o /workspace/models/mlc/Qwen3-30B-A3B-Instruct-2507-q4bf16_1/Qwen3-30B-A3B-Instruct-2507-q4bf16_1-cuda.so \
    --quantization q4bf16_1 \
    --model-type qwen3 \
    --opt="cublas_gemm=1;cudagraph=1"</textarea></div><div class="fusion-text fusion-text-30" style="--awb-margin-top:20px;"><p>With the compilation complete, the final step is to serve the model so it can handle inference requests. The<strong> mlc_llm serve</strong> command launches an HTTP server that exposes the model as an API endpoint, making it accessible for testing or integration into applications.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-13 > .CodeMirror, .fusion-syntax-highlighter-13 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-13 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_13" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_13" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_13" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">mlc_llm serve /workspace/models/mlc/Qwen3-30B-A3B-Instruct-2507-q4bf16_1 \
  --port 6678 \
  --host 0.0.0.0 \
  --device cuda \
  --mode interactive \
  --model-lib /workspace/models/mlc/Qwen3-30B-A3B-Instruct-2507-q4bf16_1/Qwen3-30B-A3B-Instruct-2507-q4bf16_1-cuda.so \
  --overrides "max_num_sequence=1;max_total_seq_length=32768;context_window_size=32768;gpu_memory_utilization=0.3"</textarea></div><div class="fusion-text fusion-text-31" style="--awb-margin-top:20px;"><p><em><strong>If you see this output, it means the model has been successfully compiled and is now serving requests.</strong></em></p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-2 hover-type-none"><img decoding="async" width="453" height="69" title="Screenshot from 2025-09-08 14-56-47" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-56-47.png" alt class="img-responsive wp-image-1333" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-56-47-200x30.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-56-47-400x61.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-56-47.png 453w" sizes="(max-width: 640px) 100vw, 453px" /></span></div><div class="fusion-text fusion-text-32" style="--awb-margin-top:20px;"><p>You can test it with this curl request:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-14 > .CodeMirror, .fusion-syntax-highlighter-14 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-14 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_14" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_14" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_14" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="hopscotch" data-mode="text/x-sh">curl -X POST http://localhost:6678/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-name>",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant."},
      {"role": "user", "content": "Hello !"}
    ],
    "temperature": 0.7,
    "max_tokens": 512,
    "stream": false
  }'</textarea></div><div class="fusion-title title fusion-title-12 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;">Which Jetson should I choose for my LLM model?</h4></div><div class="fusion-text fusion-text-33"><p>Below, you can find the RAM requirements of the most popular LLM models along with Jetson recommendations that meet the minimum specifications to run them. You can choose the one that best fits your needs.</p>
</div>
<div class="table-1">
<table width="100%">
<thead>
<tr>
<th align="left">Model</th>
<th align="left">Parameters</th>
<th align="left">Quantization</th>
<th align="left">Required RAM (GB)</th>
<th align="left">Recommended Minimum Jetson</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">deepseek-ai Deepseek-R1 Base</td>
<td align="left">684B</td>
<td align="left">Dynamic-1.58-bit</td>
<td align="left">162.11</td>
<td align="left">Not supported (exceeds 128 GB)</td>
</tr>
<tr>
<td align="left">deepseek-ai Deepseek-R1 Distill-Qwen-1.5B</td>
<td align="left">1.5B</td>
<td align="left">Q4_K_M</td>
<td align="left">0.90</td>
<td align="left">Jetson Orin Nano 4 GB, Jetson Nano 4 GB</td>
</tr>
<tr>
<td align="left">deepseek-ai Deepseek-R1 Distill-Qwen-7B</td>
<td align="left">7B</td>
<td align="left">Q5_K_M</td>
<td align="left">5.25</td>
<td align="left">Jetson Orin Nano 8 GB, Jetson Orin NX 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">mistralai Mixtral 8x22B-Instruct-v0.1</td>
<td align="left">22B</td>
<td align="left">Q4_K_M</td>
<td align="left">13.20</td>
<td align="left">Jetson Orin NX 16 GB, Jetson AGX Orin 32 GB, Jetson AGX Xavier 32 GB</td>
</tr>
<tr>
<td align="left">mistralai Mathstral 7B-v0.1</td>
<td align="left">7B</td>
<td align="left">Q5_K_M</td>
<td align="left">5.25</td>
<td align="left">Jetson Orin Nano 8 GB, Jetson Orin NX 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">google gemma-3 12b-it</td>
<td align="left">12B</td>
<td align="left">Q4_K_M</td>
<td align="left">7.20</td>
<td align="left">Jetson Orin NX 8 GB, Jetson Orin Nano 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">meta-llama Llama-3.1 70B-Instruct</td>
<td align="left">70B</td>
<td align="left">Q5_K_M</td>
<td align="left">52.50</td>
<td align="left">Jetson AGX Orin 64 GB, Jetson AGX Xavier 64 GB, Jetson AGX Thor (T5000) 128 GB</td>
</tr>
</tbody>
</table>
</div>
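As a rough cross-check of the table above, the required RAM can be estimated from the parameter count and the quantization’s bits per weight. The bits-per-weight values below are approximations inferred from the table rows (not exact GGUF file sizes) and exclude KV cache and runtime overhead:

```python
# Rule of thumb: weight memory in GB ≈ (parameters in billions) * (bits per weight) / 8.
# The bits-per-weight figures are approximations inferred from the table above,
# not exact GGUF sizes, and runtime overhead comes on top of this.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,  # ~0.60 GB per billion parameters
    "Q5_K_M": 6.0,  # ~0.75 GB per billion parameters
}

def estimate_ram_gb(params_billions: float, quant: str) -> float:
    """Approximate RAM needed just for the quantized weights, in GB."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8
```

For example, estimate_ram_gb(70, "Q5_K_M") gives 52.5 GB, matching the Llama-3.1 70B row above.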
</div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-mlc-llm-on-jetson-agx-thor/">How to Run MLC LLM on Jetson AGX Thor?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Run vLLM on Jetson AGX Thor?</title>
		<link>https://blog-en.openzeka.com/how-to-run-vllm-on-jetson-agx-thor/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Tue, 09 Sep 2025 10:16:57 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog.aetherix.com/?p=1338</guid>

					<description><![CDATA[<p>What is vLLM and Why Does It Matter on Jetson AGX Thor? ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-vllm-on-jetson-agx-thor/">How to Run vLLM on Jetson AGX Thor?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-5 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-6 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-title title fusion-title-13 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">What is vLLM and Why Does It Matter on Jetson AGX Thor?</h3></div><div class="fusion-text fusion-text-34"><p><strong>vLLM</strong> is an open-source inference engine designed to run large language models (LLMs) with exceptional efficiency. Thanks to its innovative PagedAttention architecture, vLLM delivers both high throughput and low latency making it possible to deploy advanced AI models in real-time applications.</p>
<p>On the other side, NVIDIA Jetson AGX Thor is a next-generation edge AI platform built for robotics, autonomous machines, and industrial systems. With its immense compute power and AI acceleration, Thor is the perfect hardware to unlock the full potential of LLMs at the edge.</p>
<p>When combined, vLLM on Jetson AGX Thor enables:</p>
<ul>
<li><strong>Real-time LLM services (chatbots, assistants, summarization, translation)</strong></li>
<li><strong>Vision + Language use cases (explaining camera input instantly)</strong></li>
<li><strong>On-device inference with ultra-low latency and stronger data privacy</strong></li>
<li><strong>Reduced reliance on cloud resources, with better energy efficiency</strong></li>
</ul>
<p>In short, vLLM provides the software intelligence and Thor provides the hardware muscle; together they make cutting-edge LLM experiences possible directly on the device.</p>
</div><div class="fusion-title title fusion-title-14 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Installing Process</h3></div><div class="fusion-text fusion-text-35"><p>First, download the following Triton Inference Server container image.<br />
This image comes with vLLM version 0.9.2 pre-installed. The tag 25.08 refers to August 2025.</p>
<p>If you’d like to update to a newer version in the future, you can always visit the <strong><a style="color: #00dd37;" href="https://catalog.ngc.nvidia.com/?filters=&amp;orderBy=weightPopularDESC&amp;query=&amp;page=&amp;pageSize=">NVIDIA NGC Catalog</a></strong> to find the latest container releases.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-15 > .CodeMirror, .fusion-syntax-highlighter-15 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-15 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_15" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_15" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_15" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run --name vllm_container -it \
  --gpus all \
  -p 8000:8000 \
  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
  nvcr.io/nvidia/tritonserver:25.08-vllm-python-py3 bash</textarea></div><div class="fusion-text fusion-text-36" style="--awb-margin-top:20px;"><p>You can verify the installed vLLM version directly with Python.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-16 > .CodeMirror, .fusion-syntax-highlighter-16 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-16 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_16" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_16" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_16" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">python3 -c "import vllm; print(vllm.__version__)"</textarea></div><div class="fusion-text fusion-text-37" style="--awb-margin-top:20px;"><p>Next, you’ll need to create an account on Hugging Face, generate an access token, and log in with it.</p>
<p>This token will allow the container to securely download and run models directly from <a href="https://huggingface.co/"><strong style="color: #00e200;">Hugging Face.</strong></a></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-17 > .CodeMirror, .fusion-syntax-highlighter-17 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-17 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_17" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_17" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_17" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">huggingface-cli login</textarea></div><div class="fusion-text fusion-text-38" style="--awb-margin-top:20px;"><p>To download a model, run:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-18 > .CodeMirror, .fusion-syntax-highlighter-18 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-18 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_18" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_18" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_18" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">hf download <model></textarea></div><div class="fusion-text fusion-text-39" style="--awb-margin-top:20px;"><p>Once your environment is ready, you can launch the vLLM API server using the following command:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-19 > .CodeMirror, .fusion-syntax-highlighter-19 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-19 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_19" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_19" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_19" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">python3 -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --tensor-parallel-size 1 \
  --gpu-memory-utilization 0.90 \
  --max-model-len 8192 \
  --dtype float16</textarea></div><div class="fusion-text fusion-text-40" style="--awb-margin-top:20px;"><p>Here’s what each parameter does:</p>
<ul>
<li><strong><em>--model</em></strong> → specifies which model to load (in this case, Llama-3.1-8B-Instruct from Hugging Face).</li>
<li><strong><em>--tensor-parallel-size 1</em></strong> → runs the model on a single GPU. If you have multiple GPUs, you can increase this value.</li>
<li><strong><em>--gpu-memory-utilization 0.90</em></strong> → tells vLLM to use up to 90% of available GPU memory. Adjust this if you run into memory errors.</li>
<li><strong><em>--max-model-len 8192</em></strong> → sets the maximum context length (in tokens) for the model.</li>
<li><strong><em>--dtype float16</em></strong> → runs the model in FP16 precision, which is more efficient on Jetson AGX Thor.</li>
</ul>
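<p>If you prefer Python over curl, the same OpenAI-compatible endpoint can be queried with nothing but the standard library. This is a minimal sketch, assuming the server was launched with the command above (same port and model name; no third-party packages):</p>

```python
import json
import urllib.request

def build_chat_request(prompt: str,
                       model: str = "meta-llama/Llama-3.1-8B-Instruct",
                       max_tokens: int = 64) -> dict:
    # OpenAI-style chat payload, mirroring the server launched above
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    # POST the payload to vLLM's OpenAI-compatible chat endpoint
    data = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The assistant's reply is in the first choice
    return body["choices"][0]["message"]["content"]

# chat("Hello Jetson AGX Thor!")  # requires the server to be running
```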
</div><div class="fusion-text fusion-text-41"><p><em><strong>⚠️ Heads-up: If you encounter the following error:</strong></em></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-20 > .CodeMirror, .fusion-syntax-highlighter-20 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-20 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_20" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_20" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_20" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}</textarea></div><div class="fusion-text fusion-text-42" style="--awb-margin-top:20px;"><p><em><strong>It usually means the engine couldn’t reserve enough GPU memory. Try lowering the GPU memory utilization, for example with --gpu-memory-utilization 0.75.</strong></em></p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-3 hover-type-none"><img decoding="async" width="1024" height="617" title="Screenshot from 2025-09-09 09-33-20" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20-1024x617.png" alt class="img-responsive wp-image-1345" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20-200x120.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20-400x241.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20-600x361.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20-800x482.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20-1200x723.png 1200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-09-09-33-20.png 1393w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-text fusion-text-43" style="--awb-margin-top:20px;"><p>If you see a message like:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-21 > .CodeMirror, .fusion-syntax-highlighter-21 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-21 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_21" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_21" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_21" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">Starting vLLM API server 0 on http://0.0.0.0:8000</textarea></div><div class="fusion-text fusion-text-44" style="--awb-margin-top:20px;"><p>it means that vLLM is now serving on port 8000 and ready to accept requests.<br />
At this point, you can start testing it with a simple curl command. For example:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-22 > .CodeMirror, .fusion-syntax-highlighter-22 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-22 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_22" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_22" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_22" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello Jetson AGX Thor!"}],
    "max_tokens": 64
  }'</textarea></div><div class="fusion-title title fusion-title-15 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;">Which Jetson should I choose for my LLM model?</h4></div><div class="fusion-text fusion-text-45"><p>Below, you can find the RAM requirements of the most popular LLM models along with Jetson recommendations that meet the minimum specifications to run them. You can choose the one that best fits your needs.</p>
</div>
<div class="table-1">
<table width="100%">
<thead>
<tr>
<th align="left">Model</th>
<th align="left">Parameters</th>
<th align="left">Quantization</th>
<th align="left">Required RAM (GB)</th>
<th align="left">Recommended Minimum Jetson</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">deepseek-ai Deepseek-R1 Base</td>
<td align="left">684B</td>
<td align="left">Dynamic-1.58-bit</td>
<td align="left">162.11</td>
<td align="left">Not supported (requires more than 128 GB)</td>
</tr>
<tr>
<td align="left">deepseek-ai Deepseek-R1 Distill-Qwen-1.5B</td>
<td align="left">1.5B</td>
<td align="left">Q4_K_M</td>
<td align="left">0.90</td>
<td align="left">Jetson Orin Nano 4 GB, Jetson Nano 4 GB</td>
</tr>
<tr>
<td align="left">deepseek-ai Deepseek-R1 Distill-Qwen-7B</td>
<td align="left">7B</td>
<td align="left">Q5_K_M</td>
<td align="left">5.25</td>
<td align="left">Jetson Orin Nano 8 GB, Jetson Orin NX 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">mistralai Mixtral 8x22B-Instruct-v0.1</td>
<td align="left">22B</td>
<td align="left">Q4_K_M</td>
<td align="left">13.20</td>
<td align="left">Jetson Orin NX 16 GB, Jetson AGX Orin 32 GB, Jetson AGX Xavier 32 GB</td>
</tr>
<tr>
<td align="left">mistralai Mathstral 7B-v0.1</td>
<td align="left">7B</td>
<td align="left">Q5_K_M</td>
<td align="left">5.25</td>
<td align="left">Jetson Orin Nano 8 GB, Jetson Orin NX 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">google gemma-3 12b-it</td>
<td align="left">12B</td>
<td align="left">Q4_K_M</td>
<td align="left">7.20</td>
<td align="left">Jetson Orin NX 8 GB, Jetson Orin Nano 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">meta-llama Llama-3.1 70B-Instruct</td>
<td align="left">70B</td>
<td align="left">Q5_K_M</td>
<td align="left">52.50</td>
<td align="left">Jetson AGX Orin 64 GB, Jetson AGX Xavier 64 GB, Jetson AGX Thor (T5000) 128 GB</td>
</tr>
</tbody>
</table>
</div>
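<p>The "Required RAM (GB)" figures above follow a simple rule of thumb: parameter count multiplied by the effective bits per weight of the quantization format, divided by 8 to convert to bytes. A quick sketch that reproduces the table's numbers (the bit-widths below are inferred from the table itself, roughly 4.8 bits for Q4_K_M and 6.0 bits for Q5_K_M, and the estimate covers weights only, not KV cache or runtime overhead):</p>

```python
# Rough weight-memory estimate: params (in billions) x bits per weight / 8
# gives GB. Effective bit-widths are inferred from the table above
# (Q4_K_M ~4.8 bits, Q5_K_M ~6.0 bits), not an official specification.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 6.0}

def required_ram_gb(params_billion: float, quant: str) -> float:
    return round(params_billion * BITS_PER_WEIGHT[quant] / 8, 2)

# Reproduces the table rows, e.g.:
# required_ram_gb(7, "Q5_K_M")  -> 5.25
# required_ram_gb(12, "Q4_K_M") -> 7.2
```

<p>Add a margin for context (KV cache) and the OS itself when picking a board: a model whose weights barely fit will still fail to load at longer context lengths.</p>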
</div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-vllm-on-jetson-agx-thor/">How to Run vLLM on Jetson AGX Thor?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Run Ollama on Jetson AGX Thor with OpenwebUI?</title>
		<link>https://blog-en.openzeka.com/how-to-run-ollama-on-jetson-agx-thor-with-openwebui/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Tue, 09 Sep 2025 10:11:28 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog.aetherix.com/?p=1301</guid>

					<description><![CDATA[<p>What is Ollama?  Ollama is a lightweight and flexible p ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-ollama-on-jetson-agx-thor-with-openwebui/">How to Run Ollama on Jetson AGX Thor with OpenwebUI?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-6 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-7 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-title title fusion-title-16 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">What is Ollama?</h3></div><div class="fusion-text fusion-text-46"><p>Ollama is a lightweight and flexible platform that allows you to run large language models (LLMs) directly on your own device. When running on powerful AI hardware such as the <strong>NVIDIA Jetson AGX Thor</strong>, it provides a local, fast, and secure experience without the need for cloud-based solutions.</p>
<p>Thanks to the high processing power of Jetson AGX Thor, Ollama:</p>
<ul>
<li><strong>Runs LLMs locally</strong> → Can be used even without an internet connection.</li>
<li><strong>Utilizes hardware acceleration</strong> → Leverages GPU power to generate faster responses.</li>
<li><strong>Ensures data privacy</strong> → All processing happens on-device, so sensitive data never leaves the system.</li>
<li><strong>Offers flexibility</strong> → Different models can be downloaded, customized, and tested.</li>
</ul>
<p>In short, Ollama leverages the hardware advantages of Jetson AGX Thor to make AI applications more accessible, portable, and secure.</p>
</div><div class="fusion-title title fusion-title-17 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Requirements for AGX Thor</h3></div><div class="fusion-text fusion-text-47"><ol>
<li>JetPack 7 must be installed</li>
<li>Stable high-speed internet connection</li>
<li>At least 15 GB of free disk space for Ollama itself (models you download require additional space)</li>
</ol>
</div><div class="fusion-title title fusion-title-18 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Installation Process</h3></div><div class="fusion-text fusion-text-48"><p>First, we create a folder to mount into the container.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-23 > .CodeMirror, .fusion-syntax-highlighter-23 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-23 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_23" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_23" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_23" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">mkdir ~/ollama-data/</textarea></div><div class="fusion-text fusion-text-49" style="--awb-margin-top:20px;"><p>Next, we download the image from the <strong>GitHub Container Registry.</strong><br />
The <strong>ghcr.io</strong> prefix indicates that the image is hosted on the GitHub Container Registry.</p>
<p>To access other images or check for the latest updates, you can visit the following <strong><a style="color: #2a9e00;" href="https://github.com/orgs/NVIDIA-AI-IOT/packages/container/package/ollama">link.</a></strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-24 > .CodeMirror, .fusion-syntax-highlighter-24 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-24 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_24" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_24" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_24" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run --rm -it -v ${HOME}/ollama-data:/data ghcr.io/nvidia-ai-iot/ollama:r38.2.arm64-sbsa-cu130-24.04</textarea></div><div class="fusion-text fusion-text-50" style="--awb-margin-top:20px;"><p>It will take some time to pull (download) the container image.</p>
<p>Once in the container, you will see something like this.</p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-4 hover-type-none"><img decoding="async" width="848" height="817" title="Screenshot from 2025-09-08 11-44-34" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-44-34.png" alt class="img-responsive wp-image-1306" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-44-34-200x193.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-44-34-400x385.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-44-34-600x578.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-44-34-800x771.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-44-34.png 848w" sizes="(max-width: 640px) 100vw, 848px" /></span></div><div class="fusion-text fusion-text-51" style="--awb-margin-top:20px;"><p>Try running a GPT OSS (20B parameters) model by issuing the command below.</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-25 > .CodeMirror, .fusion-syntax-highlighter-25 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-25 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_25" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_25" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_25" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">ollama run --verbose gpt-oss:20b</textarea></div><div class="fusion-text fusion-text-52" style="--awb-margin-top:20px;"><p>Once ready, it will show something like this:</p>
</div><div class="fusion-image-element " style="text-align:center;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-5 hover-type-none"><img decoding="async" width="697" height="522" title="Screenshot from 2025-09-08 11-50-28" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-50-28.png" alt class="img-responsive wp-image-1310" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-50-28-200x150.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-50-28-400x300.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-50-28-600x449.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-11-50-28.png 697w" sizes="(max-width: 640px) 100vw, 697px" /></span></div><div class="fusion-title title fusion-title-19 fusion-sep-none fusion-title-text fusion-title-size-three" style="--awb-margin-top:20px;"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Troubleshooting</h3></div><div class="fusion-text fusion-text-53"><p><strong>CUDA out of memory</strong></p>
<p>If you encounter CUDA out of memory errors, try running a <strong>smaller model.</strong><br />
You can also use quantization to reduce memory usage and run models more efficiently on your device.</p>
<p>Different model sizes and quantized versions can be found <strong><a style="color: #1bcc00;" href="https://ollama.com">here</a><span style="color: #1bcc00;">.</span> </strong></p>
</div><div class="fusion-title title fusion-title-20 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;">Installing OpenwebUI</h3></div><div class="fusion-text fusion-text-54"><p>First, run this command in the terminal:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-26 > .CodeMirror, .fusion-syntax-highlighter-26 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-26 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_26" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_26" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_26" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run -it --rm --network=host --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main</textarea></div><div class="fusion-text fusion-text-55" style="--awb-margin-top:20px;"><p><em>If you see the <strong>&#8220;application startup&#8221;</strong> message on the screen, you can proceed to the next step.</em><br />
<em>If it says <strong>&#8220;retrying&#8221;</strong> and you don’t see any progress in the download section, stop the process with <strong>Control + C</strong> and try again, or simply wait; it usually recovers on its own.</em></p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-6 hover-type-none"><img decoding="async" width="960" height="589" title="Screenshot from 2025-09-08 13-31-30" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-31-30.png" alt class="img-responsive wp-image-1314" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-31-30-200x123.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-31-30-400x245.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-31-30-600x368.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-31-30-800x491.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-31-30.png 960w" sizes="(max-width: 640px) 100vw, 960px" /></span></div><div class="fusion-text fusion-text-56" style="--awb-margin-top:20px;"><p>You can then navigate your browser to <em><strong>http://JETSON_IP:8080</strong></em>, and create an account to log in (these credentials are stored only locally). Instead of <strong>JETSON_IP</strong>, you can also use localhost.</p>
<p>Create an account.</p>
</div><div class="fusion-image-element " style="text-align:center;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-7 hover-type-none"><img decoding="async" width="613" height="482" title="Screenshot from 2025-09-08 13-36-48" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-36-48.png" alt class="img-responsive wp-image-1315" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-36-48-200x157.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-36-48-400x315.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-36-48-600x472.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-13-36-48.png 613w" sizes="(max-width: 640px) 100vw, 613px" /></span></div><div class="fusion-text fusion-text-57"><p><em><strong>⚠️ Be careful!</strong> When OpenWebUI is launched, <strong>no model</strong> will appear in the <strong>Load Models</strong> section at the top left. To connect models to <strong>OpenWebUI</strong>, we need to assign a port. Restart the Ollama container with the following command:</em></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-27 > .CodeMirror, .fusion-syntax-highlighter-27 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-27 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_27" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_27" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_27" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run --rm -it \
  -p 11434:11434 \
  -v ${HOME}/ollama-data:/data \
  ghcr.io/nvidia-ai-iot/ollama:r38.2.arm64-sbsa-cu130-24.04</textarea></div><div class="fusion-text fusion-text-58" style="--awb-margin-top:20px;"><p>You can check it by sending a <strong>curl request:</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-28 > .CodeMirror, .fusion-syntax-highlighter-28 > .CodeMirror .CodeMirror-gutters {background-color:#000000;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-28 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_28" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_28" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_28" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">curl http://localhost:11434</textarea></div><div class="fusion-text fusion-text-59" style="--awb-margin-top:20px;"><p>If you see “<strong>Ollama is running</strong>”, you can continue using it.</p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-8 hover-type-none"><img decoding="async" width="1024" height="261" title="Screenshot from 2025-09-08 14-08-00" src="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-08-00-1024x261.png" alt class="img-responsive wp-image-1320" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-08-00-200x51.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-08-00-400x102.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-08-00-600x153.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-08-00-800x204.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/09/Screenshot-from-2025-09-08-14-08-00.png 1058w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-title title fusion-title-21 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;">Which Jetson should I choose for my LLM model?</h4></div><div class="fusion-text fusion-text-60 fusion-text-no-margin" style="--awb-margin-bottom:-20px;"><p>Below, you can find the RAM requirements of the most popular LLM models along with Jetson recommendations that meet the minimum specifications to run them. You can choose the one that best fits your needs.</p>
</div>
<div class="table-1">
<p>&nbsp;</p>
<table width="100%">
<thead>
<tr>
<th align="left">Model</th>
<th align="left">Parameters</th>
<th align="left">Quantization</th>
<th align="left">Required RAM (GB)</th>
<th align="left">Recommended Minimum Jetson</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">DeepSeek-R1</td>
<td align="left">671B</td>
<td align="left">Dynamic-1.58-bit (MoE 1.5-bit + other layers 4–6-bit)</td>
<td align="left">159.03</td>
<td align="left">Not supported (requires more than 128 GB)</td>
</tr>
<tr>
<td align="left">DeepSeek-R1 Distill-Qwen-1.5B</td>
<td align="left">1.5B</td>
<td align="left">Q4_K_M</td>
<td align="left">0.90</td>
<td align="left">Jetson Orin Nano 4 GB, Jetson Nano 4 GB</td>
</tr>
<tr>
<td align="left">DeepSeek-R1 Distill-Qwen-7B</td>
<td align="left">7B</td>
<td align="left">Q5_K_M</td>
<td align="left">5.25</td>
<td align="left">Jetson Orin Nano 8 GB, Jetson Orin NX 8 GB, Jetson Xavier NX 8 GB</td>
</tr>
<tr>
<td align="left">Qwen 2.5</td>
<td align="left">14B</td>
<td align="left">FP16</td>
<td align="left">33.60</td>
<td align="left">Jetson AGX Orin 64 GB, Jetson AGX Xavier 64 GB</td>
</tr>
<tr>
<td align="left">CodeLlama</td>
<td align="left">34B</td>
<td align="left">Q4_K_M</td>
<td align="left">20.40</td>
<td align="left">Jetson AGX Orin 32 GB, Jetson AGX Xavier 32 GB</td>
</tr>
<tr>
<td align="left">Llama 3.2 Vision</td>
<td align="left">90B</td>
<td align="left">Q5_K_M</td>
<td align="left">67.50</td>
<td align="left">Jetson AGX Thor (T5000) 128 GB</td>
</tr>
<tr>
<td align="left">Phi-3</td>
<td align="left">3.8B</td>
<td align="left">FP16</td>
<td align="left">9.12</td>
<td align="left">Jetson Orin NX 16 GB</td>
</tr>
</tbody>
</table>
</div>
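The RAM figures above can be approximated from first principles: weight memory is roughly parameters × bits-per-weight ÷ 8, plus a margin for the KV cache and activations. A rough sketch (the 20% overhead factor is an assumption for illustration; real usage varies with context length and runtime):

```python
def estimate_llm_ram_gb(params_billion: float, bits_per_weight: float,
                        overhead: float = 1.2) -> float:
    """Approximate RAM needed to run a model: weight bytes plus a runtime margin.

    overhead=1.2 is an assumed ~20% allowance for KV cache and activations.
    """
    weight_gb = params_billion * bits_per_weight / 8  # 1B params @ 8-bit = 1 GB
    return round(weight_gb * overhead, 2)

# Qwen 2.5 14B at FP16 (16 bits per weight):
print(estimate_llm_ram_gb(14, 16))  # 33.6
```

The estimate should land in the same ballpark as the table; when in doubt, pick the Jetson with comfortable headroom above the computed figure.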
</div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/how-to-run-ollama-on-jetson-agx-thor-with-openwebui/">How to Run Ollama on Jetson AGX Thor with OpenwebUI?</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Jetson Generative AI – JupyterLab Development Environment</title>
		<link>https://blog-en.openzeka.com/jetson-generative-ai-jupyterlab-development-environment/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Thu, 07 Aug 2025 13:11:04 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog.aetherix.com/?p=1130</guid>

					<description><![CDATA[<p>JupyterLab turns your Jetson into a powerful interacti ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/jetson-generative-ai-jupyterlab-development-environment/">Jetson Generative AI – JupyterLab Development Environment</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-7 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-8 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-text fusion-text-61"><p><strong>JupyterLab</strong> turns your Jetson into a powerful interactive development environment for AI, data science, and ML. It provides the familiar building blocks of classic Jupyter Notebook (notebooks, terminal, text editor, file browser, rich outputs) in a flexible UI. With GPU‑accelerated containers, you can train models and prototype solutions directly on your Jetson.</p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-9 hover-type-none"><img decoding="async" width="1024" height="497" title="jupyterlab_interface" src="https://blog-en.openzeka.com/wp-content/uploads/2025/07/jupyterlab_interface-1024x497.png" alt class="img-responsive wp-image-1135" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/07/jupyterlab_interface-200x97.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/07/jupyterlab_interface-400x194.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/07/jupyterlab_interface-600x291.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/07/jupyterlab_interface-800x388.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/07/jupyterlab_interface-1200x582.png 1200w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-title title fusion-title-22 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-bottom:-30px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4> Requirements</h4></h4></div>
<div class="table-1">
<p>&nbsp;</p>
<table width="100%">
<thead>
<tr>
<th align="left">
<div>
<div>Hardware / Software</div>
</div>
</th>
<th align="left">
<div>
<div>Notes</div>
</div>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>Jetson (Nano / Orin series)</strong> ≥ <strong>4 GB RAM</strong></td>
<td align="left"><strong>8 GB+</strong> recommended for larger notebooks and models</td>
</tr>
<tr>
<td align="left"><strong>NVMe SSD</strong></td>
<td align="left"><strong>Highly recommended</strong> for faster I/O and model storage (microSD works but slower)</td>
</tr>
<tr>
<td align="left"><strong>JetPack 5.0 or newer</strong></td>
<td align="left">Latest versions recommended for best container support</td>
</tr>
<tr>
<td align="left"><strong>NVIDIA Container Toolkit</strong></td>
<td align="left">Installed with JetPack; if it is missing, you can install it by following the <a href="https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html">install guide</a>.</td>
</tr>
<tr>
<td align="left"><strong>Docker</strong></td>
<td align="left">May require manual installation on JetPack 6</td>
</tr>
</tbody>
</table>
</div>
<div class="fusion-title title fusion-title-23 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4 id="stepbystep-setup" data-source-line="23">Step‑by‑Step Setup</h4></h4></div><div class="fusion-title title fusion-title-24 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4 id="1-verify-jetpack-installation" data-source-line="25">1. Verify JetPack Installation</h4></h4></div><div class="fusion-text fusion-text-62"><p>First, check your <strong>JetPack version</strong> and ensure <strong>Docker</strong> is working:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-29 > .CodeMirror, .fusion-syntax-highlighter-29 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-29 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_29" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_29" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_29" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Check JetPack version
cat /etc/nv_tegra_release

# Verify Docker and NVIDIA runtime
docker --version
docker run --rm --runtime nvidia hello-world</textarea></div><div class="fusion-text fusion-text-63"><p> </p>
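If you want to read the version programmatically (for example, to pick the matching container tag), the first line of /etc/nv_tegra_release encodes the L4T release and revision. A sketch, assuming the usual <strong># R36 (release), REVISION: 4.3, …</strong> line format (the sample values below are illustrative):

```python
import re

def l4t_version(release_line: str) -> str:
    """Parse 'R<major> ... REVISION: <minor>' into an r<major>.<minor> tag."""
    m = re.search(r"R(\d+).*REVISION:\s*([\d.]+)", release_line)
    if not m:
        raise ValueError("unrecognized nv_tegra_release format")
    return f"r{m.group(1)}.{m.group(2)}"

# Illustrative first line of /etc/nv_tegra_release (fields abridged):
line = "# R36 (release), REVISION: 4.3, GCID: 38968081, BOARD: generic"
print(l4t_version(line))  # r36.4.3
```

The resulting string has the same shape as the tags used by the containers later in this guide (e.g. r36.x for JetPack 6).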
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-10 hover-type-none"><img decoding="async" width="1700" height="1064" title="version" src="https://blog-en.openzeka.com/wp-content/uploads/2025/07/version.png" alt class="img-responsive wp-image-1132" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/07/version-300x188.png 300w, https://blog-en.openzeka.com/wp-content/uploads/2025/07/version.png 1700w" sizes="(max-width: 1700px) 100vw, 1700px" /></span></div><div class="fusion-title title fusion-title-25 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>2. Create a persistent workspace</h4></h4></div><div class="fusion-text fusion-text-64"><p>Use a dedicated workspace so <strong>notebooks/files</strong> persist outside the container:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-30 > .CodeMirror, .fusion-syntax-highlighter-30 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-30 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_30" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_30" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_30" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Create project directories
mkdir -p ~/projects
mkdir -p ~/datasets

# Create cache directory (adjust path based on your SSD mount)
# For NVMe SSD mounted at /mnt/nvme:
sudo mkdir -p /mnt/nvme/cache
sudo chown -R $USER:$USER /mnt/nvme/cache

mkdir -p ~/.jupyter</textarea></div><div class="fusion-title title fusion-title-26 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>3. Launch JupyterLab with GPU</h4></h4></div><div class="fusion-text fusion-text-65"><p>The dustynv/jupyterlab image provides JupyterLab on port <strong>8888</strong>. Note that newer JetPack versions use different container tags:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-31 > .CodeMirror, .fusion-syntax-highlighter-31 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-31 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_31" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_31" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_31" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># For JetPack 6.0+ (r36.x):
docker run -it --rm \
  --name=jupyterlab \
  --network=host \
  -e JUPYTER_PORT=8888 \
  -e JUPYTER_LOGS=/root/.cache/jupyter/jupyter.log \
  -v /mnt/nvme/cache/jupyter:/root/.cache/jupyter \
  -v /mnt/nvme/cache/jupyter/ipynb_checkpoints:/root/.ipynb_checkpoints \
  -v /mnt/nvme/cache/jupyter/ipython:/root/.ipython \
  -v /mnt/nvme/cache/jupyter/jupyter:/root/.jupyter \
  --runtime=nvidia \
  -e DOCKER_PULL=always --pull always \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/jupyterlab:r36.4.0

# For JetPack 5.x (r35.x):
# dustynv/jupyterlab:r35.4.0</textarea></div><div class="fusion-title title fusion-title-27 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>4. Set your own password</h4></h4></div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-11 hover-type-none"><img decoding="async" width="1024" height="919" title="token" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/token-1024x919.png" alt class="img-responsive wp-image-1150" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/token-200x179.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/token-400x359.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/token-600x538.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/token-800x718.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/token-1200x1077.png 1200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/token.png 1518w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-text fusion-text-66" style="--awb-margin-top:20px;"><p>Your password is <strong>stored at</strong>:</p>
<blockquote>
<p>/root/.jupyter/jupyter_server_config.json.</p>
</blockquote>
</div><div class="fusion-title title fusion-title-28 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>5. Access the Web UI</h4></h4></div><div class="fusion-text fusion-text-67"><p>Open your browser and navigate to:</p>
<blockquote>
<p>http://localhost:8888/lab (on the Jetson itself)<br />
http://&lt;jetson-ip&gt;:8888/lab (from another device on your LAN)</p>
</blockquote>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-12 hover-type-none"><img decoding="async" width="822" height="384" title="login" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/login.png" alt class="img-responsive wp-image-1153" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/login-200x93.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/login-400x187.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/login-600x280.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/login-800x374.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/login.png 822w" sizes="(max-width: 640px) 100vw, 822px" /></span></div><div class="fusion-text fusion-text-68"><p><strong>You can change your password like this:</strong></p>
<p>In JupyterLab → Terminal (or from host via docker exec):</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-32 > .CodeMirror, .fusion-syntax-highlighter-32 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-32 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_32" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_32" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_32" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Inside the running container
jupyter lab password</textarea></div><div class="fusion-title title fusion-title-29 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>6. Verify GPU access</h4></h4></div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-33 > .CodeMirror, .fusion-syntax-highlighter-33 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-33 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_33" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_33" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_33" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">import torch
print("CUDA available:", torch.cuda.is_available())
print("CUDA device count:", torch.cuda.device_count())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("VRAM (GB):", round(torch.cuda.get_device_properties(0).total_memory / 1e9, 1))</textarea></div><div class="fusion-text fusion-text-69" style="--awb-margin-top:20px;"><p>Outside the container run:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-34 > .CodeMirror, .fusion-syntax-highlighter-34 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-34 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_34" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_34" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_34" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">jtop</textarea></div><div class="fusion-text fusion-text-70" style="--awb-margin-top:20px;"><p>to verify GPU usage.</p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-13 hover-type-none"><img decoding="async" width="1024" height="519" title="jtop" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/jtop-1024x519.png" alt class="img-responsive wp-image-1152" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/jtop-200x101.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/jtop-400x203.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/jtop-600x304.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/jtop-800x405.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/jtop-1200x608.png 1200w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-title title fusion-title-30 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>7. Advanced: Mount additional data sources</h4></h4></div><div class="fusion-text fusion-text-71"><p>For complex projects, mount additional host directories:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-35 > .CodeMirror, .fusion-syntax-highlighter-35 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-35 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_35" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_35" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_35" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run -it --rm \
  --name jupyterlab \
  --network=host \
  --runtime=nvidia \
  -v ~/projects:/workspace \
  -v ~/datasets:/workspace/datasets \
  -v /mnt/nvme/cache:/root/.cache \
  -v ~/.jupyter:/root/.jupyter \
  -v /data/experiments:/workspace/experiments \
  -v /media/usb:/workspace/usb_data \
  dustynv/jupyterlab:r36.4.0</textarea></div><div class="fusion-title title fusion-title-31 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;"><h3>Sample AI Workflows</h3></h3></div><div class="fusion-title title fusion-title-32 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Computer Vision Pipeline (GPU-accelerated):</h4></h4></div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-36 > .CodeMirror, .fusion-syntax-highlighter-36 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-36 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_36" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_36" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_36" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">import torch
import torchvision.transforms as T
from PIL import Image
import requests
import time

# Verify GPU setup
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Load and preprocess image on GPU
url = "https://example.com/sample_image.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")  # force 3 channels for Normalize

# Preprocess on the CPU, then move the tensor to the GPU
transform = T.Compose([
    T.Resize((224, 224)), 
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

start_time = time.time()
image_tensor = transform(image).unsqueeze(0).to(device)
print(f"Preprocessing completed in {time.time() - start_time:.3f}s on {device}")</textarea></div><div class="fusion-title title fusion-title-33 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Natural Language Processing:</h4></h4></div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-37 > .CodeMirror, .fusion-syntax-highlighter-37 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-37 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_37" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_37" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_37" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">import torch
from transformers import pipeline

# Use GPU if available
device = 0 if torch.cuda.is_available() else -1
print(f"Using device: {'GPU' if device == 0 else 'CPU'}")

# Initialize pipeline
classifier = pipeline(
    "sentiment-analysis", 
    device=device,
    model="distilbert-base-uncased-finetuned-sst-2-english"
)

# Test with sample text
results = classifier([
    "I love building AI workflows in JupyterLab on Jetson!",
    "This GPU acceleration makes training so much faster."
])

for text, result in zip(["Text 1", "Text 2"], results):
    print(f"{text}: {result['label']} (confidence: {result['score']:.3f})")</textarea></div><div class="fusion-title title fusion-title-34 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;"><h3>Troubleshooting</h3></h3></div><div class="fusion-title title fusion-title-35 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Common Issues &amp; Solutions:</h4></h4></div><div class="fusion-text fusion-text-72"><p><strong>JupyterLab won&#8217;t start:</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-38 > .CodeMirror, .fusion-syntax-highlighter-38 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-38 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_38" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_38" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_38" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Check if port is in use
sudo netstat -tulpn | grep 8888

# Try different port
docker run ... -p 8889:8888 ...

# Check container logs
docker logs jupyterlab</textarea></div><div class="fusion-text fusion-text-73" style="--awb-margin-top:20px;"><p><strong>Out of memory errors:</strong></p>
<ul>
<li>Reduce batch sizes</li>
<li>Use gradient checkpointing</li>
<li>Enable mixed precision training</li>
<li>Monitor with tegrastats</li>
</ul>
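Monitoring with tegrastats can also be scripted. A minimal Python sketch that pulls the RAM figures out of a single tegrastats status line (the field layout shown is typical, but it can vary between JetPack releases):

```python
import re

# Parse the RAM field from a tegrastats status line, e.g.
# "RAM 3162/7844MB (lfb 2x1MB) SWAP 0/3922MB (cached 0MB) CPU [2%@1420,...]"
RAM_PATTERN = re.compile(r"RAM (\d+)/(\d+)MB")

def ram_usage(line: str):
    """Return (used_mb, total_mb), or None if the line has no RAM field."""
    match = RAM_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

if __name__ == "__main__":
    sample = "RAM 3162/7844MB (lfb 2x1MB) SWAP 0/3922MB (cached 0MB)"
    used, total = ram_usage(sample)
    print(f"RAM: {used}/{total} MB ({100 * used / total:.1f}%)")
```

Piping `tegrastats` into a loop over this function gives a lightweight out-of-memory early-warning without a full monitoring stack.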
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-14 hover-type-none"><img decoding="async" width="1024" height="159" title="tegrastats" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/tegrastats-1024x159.png" alt class="img-responsive wp-image-1151" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/tegrastats-200x31.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/tegrastats-400x62.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/tegrastats-600x93.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/tegrastats-800x124.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/tegrastats.png 1122w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-text fusion-text-74" style="--awb-margin-top:20px;"><p><strong>Package installation failures:</strong></p>
<ul>
<li>Check Python version compatibility</li>
<li>For OpenCV, prefer JetPack&#8217;s optimized version</li>
<li>Use <code>pip install --no-cache-dir</code> for memory-constrained installs</li>
</ul>
</div><div class="fusion-title title fusion-title-36 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-bottom:-20px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Additional Resources</h4></h4></div><div class="fusion-text fusion-text-75"><ul>
<li><a href="https://www.jetson-ai-lab.com/"><strong style="color: #38c92e;">NVIDIA Jetson AI Lab</strong></a></li>
<li><a href="https://github.com/dusty-nv/jetson-containers"><strong style="color: #38c92e;">Jetson Containers Repository</strong></a></li>
<li><a href="https://jupyterlab.readthedocs.io/"><strong style="color: #38c92e;">JupyterLab Documentation</strong></a></li>
<li><a href="https://catalog.ngc.nvidia.com/"><strong style="color: #38c92e;">NVIDIA NGC Catalog</strong></a></li>
</ul>
<p><em>For the latest updates and community discussions, visit the NVIDIA Developer Forums.</em></p>
</div></div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/jetson-generative-ai-jupyterlab-development-environment/">Jetson Generative AI – JupyterLab Development Environment</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Jetson Generative AI – n8n Local Agents</title>
		<link>https://blog-en.openzeka.com/jetson-generative-ai-n8n-local-agents/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Thu, 07 Aug 2025 13:09:07 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog.aetherix.com/?p=1164</guid>

					<description><![CDATA[<p>n8n transforms your Jetson into an intelligent agent f ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/jetson-generative-ai-n8n-local-agents/">Jetson Generative AI – n8n Local Agents</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-8 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-9 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-text fusion-text-76"><p><strong>n8n</strong> transforms your Jetson into an intelligent agent factory with its visual workflow automation platform. This fair-code licensed tool, with over 123k GitHub stars (as of July), lets you create autonomous agents that think, decide, and act locally using local LLMs via Ollama, or by calling external services. You can build RAG pipelines and sophisticated AI agents that monitor systems, process data, make decisions, and execute actions using only n8n workflows.</p>
</div><div class="fusion-image-element " style="--awb-aspect-ratio:4 / 3;--awb-margin-bottom:-140px;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-15 hover-type-none has-aspect-ratio"><img decoding="async" width="2830" height="1396" title="ollamarag" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/ollamarag.png" class="img-responsive wp-image-1169 img-with-aspect-ratio" data-parent-fit="cover" data-parent-container=".fusion-image-element" alt /></span></div><div class="fusion-title title fusion-title-37 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-bottom:-30px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Requirements</h4></h4></div>
<div class="table-1">
<p>&nbsp;</p>
<table width="100%">
<thead>
<tr>
<th align="left">Hardware / Software</th>
<th align="left">Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>Any Jetson (Nano/Orin)</strong> ≥ <strong>4 GB RAM</strong></td>
<td align="left">16GB+ recommended for complex workflows</td>
</tr>
<tr>
<td align="left"><strong>NVMe SSD</strong></td>
<td align="left">Recommended for workflow data storage</td>
</tr>
</tbody>
</table>
</div>
<div class="fusion-title title fusion-title-38 fusion-sep-none fusion-title-text fusion-title-size-three" style="--awb-margin-bottom:-30px;"><h3 class="fusion-title-heading title-heading-left" style="margin:0;"><h3>Step-by-Step Setup</h3></h3></div><div class="fusion-title title fusion-title-39 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>1. Create Necessary Directories</h4></h4></div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-39 > .CodeMirror, .fusion-syntax-highlighter-39 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-39 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_39" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_39" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_39" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">sudo mkdir -p /mnt/nvme/cache/n8n
sudo chown -R $USER:$USER /mnt/nvme/cache/n8n</textarea></div><div class="fusion-title title fusion-title-40 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>2. Launch n8n</h4></h4></div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-40 > .CodeMirror, .fusion-syntax-highlighter-40 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-40 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_40" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_40" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_40" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker run -it --rm --name=n8n \
  --network=host \
  -e N8N_LISTEN_ADDRESS=0.0.0.0 \
  -e N8N_PORT=5678 \
  -e N8N_SECURE_COOKIE=false \
  -v /mnt/nvme/cache/n8n:/home/node/.n8n \
  --pull always \
  n8nio/n8n:stable</textarea></div><div class="fusion-title title fusion-title-41 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>3. Access the Web Interface</h4></h4></div><div class="fusion-text fusion-text-77"><p>Once the container starts, you&#8217;ll see:</p>
<blockquote>
<p>Editor is now accessible via:<br />
http://localhost:5678</p>
</blockquote>
<ul>
<li>Local access: Open <strong>http://localhost:5678</strong> in your browser</li>
<li>Remote access: Use<strong> http://&lt;jetson-ip&gt;:5678</strong></li>
</ul>
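Once the editor is reachable, you can script a liveness check against n8n's <code>/healthz</code> endpoint (the path matches n8n's documented health check; treat it as an assumption if your version differs). A minimal Python sketch using only the standard library:

```python
import json
import urllib.request

def n8n_health_url(host: str = "localhost", port: int = 5678) -> str:
    """Build the URL for n8n's /healthz endpoint."""
    return f"http://{host}:{port}/healthz"

def check_n8n(host: str = "localhost", port: int = 5678) -> bool:
    """Return True if the n8n instance reports healthy, False otherwise."""
    try:
        with urllib.request.urlopen(n8n_health_url(host, port), timeout=5) as resp:
            return json.load(resp).get("status") == "ok"
    except OSError:
        # Connection refused, DNS failure, timeout, etc.
        return False

if __name__ == "__main__":
    print("n8n up:", check_n8n())
```

This is handy in a startup script that waits for the container before opening the browser.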
</div><div class="fusion-title title fusion-title-42 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>4. Set up Ollama (Separate Container)</h4></h4></div><div class="fusion-text fusion-text-78"><p>In a new terminal, start Ollama:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-41 > .CodeMirror, .fusion-syntax-highlighter-41 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-41 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_41" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_41" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_41" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Create Ollama data directory
sudo mkdir -p /mnt/nvme/cache/ollama
sudo chown -R $USER:$USER /mnt/nvme/cache/ollama

# Run Ollama container (--runtime nvidia enables GPU acceleration on Jetson)
docker run -d --name=ollama \
  --runtime nvidia \
  -p 11434:11434 \
  -v /mnt/nvme/cache/ollama:/root/.ollama \
  ollama/ollama</textarea></div><div class="fusion-title title fusion-title-43 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>5. Set up Ollama Models</h4></h4></div><div class="fusion-text fusion-text-79"><p>Pull AI models into your separate Ollama container:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-42 > .CodeMirror, .fusion-syntax-highlighter-42 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-42 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_42" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_42" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_42" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Pull popular models (choose based on your Jetson's RAM)
docker exec -it ollama ollama pull llama3.2:3b    # Smaller model (3B parameters)
docker exec -it ollama ollama pull llama3.1:8b    # Larger model (8B parameters) 
docker exec -it ollama ollama pull mistral:7b     # Alternative model

# Verify models are downloaded
docker exec -it ollama ollama list

# Test Ollama is responding
curl http://localhost:11434/api/tags</textarea></div><div class="fusion-text fusion-text-80" style="--awb-margin-top:20px;"><p><strong>Model Size Guide for Jetson:</strong></p>
<ul>
<li><strong>4-8GB RAM:</strong> Use 3B models (llama3.2:3b)</li>
<li><strong>16GB+ RAM:</strong> Can handle 7B-8B models</li>
<li><strong>32GB+ RAM:</strong> Can run multiple large models simultaneously</li>
</ul>
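The same endpoint n8n talks to can be exercised directly. A minimal Python sketch that posts a one-shot prompt to Ollama's <code>/api/generate</code> REST endpoint (the model name assumes you pulled <code>llama3.2:3b</code> in the step above):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def build_generate_payload(model: str, prompt: str) -> dict:
    """Request body for Ollama's /api/generate endpoint.

    stream=False asks for one complete JSON response
    instead of a stream of chunks.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a one-shot prompt to the local Ollama server and return its reply."""
    body = json.dumps(build_generate_payload(model, prompt)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    print(generate("llama3.2:3b", "In one sentence, what is a Jetson?"))
```

If this works from the host, any connection problem you later hit in n8n is configuration rather than the model itself.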
</div><div class="fusion-title title fusion-title-44 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-bottom:-20px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>6. Configure Ollama Connection in n8n</h4></h4></div><div class="fusion-text fusion-text-81"><ol>
<li><strong>In n8n workflows</strong>:
<ul>
<li>Add &#8220;Ollama Chat Model&#8221; node to any workflow</li>
<li>Set base URL to <code>http://localhost:11434</code></li>
<li>Select your pulled model from dropdown (e.g., llama3.2:3b)</li>
<li>Test the connection</li>
</ul>
</li>
<li><strong>Create credentials</strong> (if needed):
<ul>
<li>Go to Settings &gt; Credentials</li>
<li>Add &#8220;Ollama&#8221; credential</li>
<li>Base URL: <code>http://localhost:11434</code></li>
</ul>
</li>
</ol>
<p><strong>Note</strong>: Since the n8n container uses <code>--network=host</code> and the Ollama container publishes port 11434 on the host, the two can communicate via <code>localhost:11434</code>.</p>
</div><div class="fusion-title title fusion-title-45 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>7. Use external services</h4></h4></div><div class="fusion-text fusion-text-82"><p>If you&#8217;re going to use an external service such as OpenAI, add its node and then add its credentials.</p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-16 hover-type-none"><img decoding="async" width="1024" height="537" title="n8n_credentials_setup" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_credentials_setup-1024x537.png" alt class="img-responsive wp-image-1170" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_credentials_setup-200x105.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_credentials_setup-400x210.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_credentials_setup-600x314.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_credentials_setup-800x419.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_credentials_setup-1200x629.png 1200w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-title title fusion-title-46 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>8. Explore Community Workflows and Templates</h4></h4></div><div class="fusion-text fusion-text-83"><p><strong>n8n</strong> has a rich ecosystem of community-contributed workflows that you can use as starting points:</p>
<ul>
<li><strong>Official Template Gallery</strong>: Visit <a href="https://n8n.io/workflows">n8n.io/workflows</a> to browse 800+ workflow templates</li>
<li><strong>GitHub Community</strong>: Search GitHub for &#8220;n8n-workflow&#8221; to find community contributions</li>
<li><strong>Template Categories</strong>:
<ul>
<li>AI Agent Chat workflows</li>
<li>Content creation automation</li>
<li>Social media management</li>
<li>Email processing with AI</li>
<li>Data transformation pipelines</li>
<li>Slack/Discord bots</li>
</ul>
</li>
</ul>
<p><strong>How to use templates:</strong></p>
<ol>
<li>Browse templates at<b> <a style="color: #38c92e;" href="http://n8n.io/workflows">n8n.io/workflows</a></b></li>
<li>Click &#8220;Use for free&#8221; on any template</li>
<li>You can copy the JSON file and paste it into the workflow or save it as a JSON file and upload it</li>
<li>Customize nodes and credentials to match your setup</li>
</ol>
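For step 3, an exported n8n workflow is plain JSON with <code>nodes</code> and <code>connections</code> at the top level. A hedged sketch of that shape plus a quick sanity check before pasting (the node <code>type</code> string here is illustrative, not an exact n8n identifier; real templates carry extra fields such as <code>name</code> and <code>settings</code>):

```python
import json

# Skeleton of an exported n8n workflow: "nodes" and "connections"
# are the parts the editor needs to rebuild the canvas.
MINIMAL_WORKFLOW = """
{
  "nodes": [
    {
      "name": "Chat Trigger",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "position": [0, 0],
      "parameters": {}
    }
  ],
  "connections": {}
}
"""

def looks_like_workflow(text: str) -> bool:
    """Rough sanity check on template JSON before pasting it into the editor."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(data.get("nodes"), list) and "connections" in data

if __name__ == "__main__":
    print(looks_like_workflow(MINIMAL_WORKFLOW))
```

A check like this catches truncated copy-pastes before the editor rejects them with a less helpful error.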
<p><strong>Popular AI templates to try:</strong></p>
<ul>
<li><strong><a style="color: #38c92e;" href="https://n8n.io/workflows/2861-ai-powered-email-processing-autoresponder-and-response-approval-yesno/">AI-powered email autoresponder</a></strong></li>
<li><strong><a style="color: #38c92e;" href="https://n8n.io/workflows/3135-automated-social-media-content-publishing-factory-system-prompt-composition/">Social media content generation</a></strong></li>
<li><strong><a style="color: #38c92e;" href="https://n8n.io/workflows/3442-fully-automated-ai-video-generation-and-multi-platform-publishing/">Automated video creation and publishing</a></strong></li>
<li><strong><a style="color: #38c92e;" href="https://n8n.io/workflows/3804-automated-pr-code-reviews-with-github-gpt-4-and-google-sheets-best-practices/">Automated code reviews with GPT-4</a></strong></li>
<li><strong><a style="color: #38c92e;" href="https://n8n.io/workflows/3123-automatic-reminders-for-follow-ups-with-ai-and-human-in-the-loop-gmail/">Follow-up reminders with AI</a></strong></li>
</ul>
</div><div class="fusion-title title fusion-title-47 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>9. Advanced agent features</h4></h4></div><div class="fusion-text fusion-text-84"><p data-source-line="147">You can create advanced workflows with n8n using:</p>
<ul data-source-line="149">
<li data-source-line="149"><strong>Interactive Chat Agents</strong>: Build conversational AI with Chat Trigger nodes for real-time user interactions</li>
<li data-source-line="150"><strong>File Processing Intelligence</strong>: Load schemas, extract data from files, and combine with AI for document analysis</li>
<li data-source-line="151"><strong>Memory Management</strong>: Use Window Buffer Memory to maintain conversation context across multiple interactions</li>
<li data-source-line="152"><strong>Multi-step reasoning</strong>: Chain multiple LLM operations for complex decisions and data processing</li>
<li data-source-line="153"><strong>Dynamic Data Combination</strong>: Merge schema data with chat inputs for context-aware responses</li>
<li data-source-line="154"><strong>Conditional logic</strong>: Route agent workflows based on AI-generated decisions and user inputs</li>
<li data-source-line="155"><strong>Error handling</strong>: Build robust failure recovery into agent behavior</li>
<li data-source-line="156"><strong>Local AI integration</strong>: Combine multiple local models for specialized tasks like SQL generation, data analysis</li>
</ul>
<p><em><strong>E.g., a complex agent workflow that generates SQL queries from a schema using local LLM reasoning:</strong></em></p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-17 hover-type-none"><img decoding="async" width="2864" height="1372" title="n8n_advanced_workflow" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/n8n_advanced_workflow.png" alt class="img-responsive wp-image-1171"/></span></div><div class="fusion-title title fusion-title-48 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;"><h3>Troubleshooting &amp; Common Issues</h3></h3></div><div class="fusion-title title fusion-title-49 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>&#8220;Model not found&#8221; in Ollama node</h4></h4></div><div class="fusion-text fusion-text-85"><p>This error occurs when you try to use a model in n8n that hasn&#8217;t been downloaded to your Ollama container yet.</p>
<p><strong>Solution: Download the model first by running:</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-43 > .CodeMirror, .fusion-syntax-highlighter-43 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-43 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_43" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_43" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_43" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker exec -it ollama ollama pull [model-name]</textarea></div><div class="fusion-text fusion-text-86"><p>For example, to pull Llama 3.2 3B model:</p>
<p>&nbsp;</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-44 > .CodeMirror, .fusion-syntax-highlighter-44 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-44 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_44" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_44" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_44" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker exec -it ollama ollama pull llama3.2:3b</textarea></div><div class="fusion-title title fusion-title-50 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Connection refused</h4></h4></div><div class="fusion-text fusion-text-87"><p>This happens when n8n cannot connect to the Ollama service, usually due to network configuration issues.</p>
<p><strong>Solutions:</strong></p>
<ul>
<li>Ensure both containers are running with <code>--network=host</code> flag</li>
<li>Verify Ollama is accessible by testing: <code>curl http://localhost:11434/api/tags</code></li>
<li>For Mac users, use <code>host.docker.internal:11434</code> instead of <code>localhost:11434</code></li>
<li>Check if Ollama container is running: <code>docker ps</code></li>
</ul>
</div><div class="fusion-title title fusion-title-51 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>&#8220;Pull model manifest: 412&#8221; error</h4></h4></div><div class="fusion-text fusion-text-88"><p>This error typically occurs when using outdated Docker images or when there are authentication issues with model repositories.</p>
<p><strong>Solutions:</strong></p>
<ul>
<li>Update to the latest Docker images:</li>
</ul>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-45 > .CodeMirror, .fusion-syntax-highlighter-45 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-45 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_45" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_45" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_45" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">docker pull n8nio/n8n:stable
docker pull ollama/ollama:latest</textarea></div><div class="fusion-text fusion-text-89"><ul>
<li>Clear Docker cache and restart containers</li>
<li>Check your internet connection and firewall settings</li>
<li>Verify the model name is correct and still available in the Ollama library</li>
</ul>
<p><em>For comprehensive guides and documentation, visit <strong><a href="https://docs.n8n.io"><span style="color: #38c92e;">docs.n8n.io</span>.</a></strong> The platform offers extensive customization options and community support for automation workflows.</em></p>
</div></div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/jetson-generative-ai-n8n-local-agents/">Jetson Generative AI – n8n Local Agents</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Jetson Generative AI &#8211; Flowise</title>
		<link>https://blog-en.openzeka.com/jetson-generative-ai-flowise/</link>
		
		<dc:creator><![CDATA[Enhar]]></dc:creator>
		<pubDate>Thu, 07 Aug 2025 13:08:45 +0000</pubDate>
				<category><![CDATA[Generative AI]]></category>
		<guid isPermaLink="false">https://blog.aetherix.com/?p=1187</guid>

					<description><![CDATA[<p>Flowise is an open-source, low-code tool for building  ... Continue Reading→</p>
<p>The post <a href="https://blog-en.openzeka.com/jetson-generative-ai-flowise/">Jetson Generative AI &#8211; Flowise</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="fusion-fullwidth fullwidth-box fusion-builder-row-9 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling" style="--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;" ><div class="fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap" style="max-width:1331.2px;margin-left: calc(-4% / 2 );margin-right: calc(-4% / 2 );"><div class="fusion-layout-column fusion_builder_column fusion-builder-column-10 fusion_builder_column_1_1 1_1 fusion-flex-column" style="--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;"><div class="fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column"><div class="fusion-text fusion-text-90"><p><strong>Flowise</strong> is an open-source, low-code tool for building customized LLM applications and AI agents. Flowise is designed to let anyone build powerful AI-driven solutions without writing a single line of code!</p>
</div><div class="fusion-title title fusion-title-52 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-top:-10px;--awb-margin-bottom:-10px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Why Flowise on Jetson?</h4></h4></div><div class="fusion-title title fusion-title-53 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Flowise vs n8n: Choosing the Right Tool</h4></h4></div><div class="fusion-text fusion-text-91"><p>At first glance, Flowise and n8n might seem pretty similar &#8211; both offer visual workflow builders and can handle AI tasks. But when you dig deeper, they each have their own advantages and disadvantages that make them better suited for different types of projects.</p>
</div><div class="fusion-text fusion-text-92"><p><strong>Core Philosophy Difference</strong><br />
<strong>Flowise:</strong> AI-first platform specializing in LLM workflows<br />
<strong>n8n:</strong> General-purpose automation platform that can do AI</p>
</div>
<div class="table-1">
<table width="100%">
<thead>
<tr>
<th align="left">Capability</th>
<th align="left">Flowise</th>
<th align="left">n8n</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>AI Focus</strong></td>
<td align="left">LLM-optimized with built-in prompt engineering</td>
<td align="left">300+ integrations, AI as add-on</td>
</tr>
<tr>
<td align="left"><strong>Learning Curve</strong></td>
<td align="left">Easier for AI newcomers</td>
<td align="left">Better for developers</td>
</tr>
<tr>
<td align="left"><strong>Code Execution</strong></td>
<td align="left">Limited, component-focused</td>
<td align="left">Full JavaScript/Python support</td>
</tr>
<tr>
<td align="left"><strong>Performance</strong></td>
<td align="left">Good for moderate AI workloads</td>
<td align="left">Enterprise-grade with scaling</td>
</tr>
<tr>
<td align="left"><strong>Templates</strong></td>
<td align="left">AI-focused (RAG, Agents, Research)</td>
<td align="left">800+ general automation templates</td>
</tr>
</tbody>
</table>
</div>
<div class="fusion-title title fusion-title-54 fusion-sep-none fusion-title-text fusion-title-size-one"><h1 class="fusion-title-heading title-heading-left" style="margin:0;"><h4 id="choose-flowise-when-you" data-source-line="27">Choose Flowise When You:</h4></h1></div><div class="fusion-text fusion-text-93"><p><strong>Want rapid AI prototyping</strong> &#8211; Get chatbots running in minutes<br />
<strong>Focus on conversational AI</strong> &#8211; Built-in conversation management<br />
<strong>Need AI-specific tools</strong> &#8211; Native LangChain/LlamaIndex integration<br />
<strong>Prefer simplicity</strong> &#8211; Lower learning curve for LLM projects</p>
</div><div class="fusion-title title fusion-title-55 fusion-sep-none fusion-title-text fusion-title-size-one"><h1 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Choose n8n When You:</h4></h1></div><div class="fusion-text fusion-text-94"><p><strong>Need enterprise integrations</strong> &#8211; Connect AI with existing business systems<br />
<strong>Want code flexibility</strong> &#8211; Custom JavaScript/Python execution<br />
<strong>Require complex workflows</strong> &#8211; Advanced scraping, data processing<br />
<strong>Build beyond AI</strong> &#8211; General automation across multiple systems</p>
</div><div class="fusion-title title fusion-title-56 fusion-sep-none fusion-title-text fusion-title-size-one"><h1 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>For the Jetson:</h4></h1></div><div class="fusion-text fusion-text-95"><p>For <strong>edge AI applications</strong>, Flowise&#8217;s specialization often wins because:</p>
<ul>
<li><strong>Local LLM integration</strong> is seamless (Ollama support)</li>
<li><strong>Rapid iteration</strong> matters more than complex integrations</li>
<li><strong>AI-first design</strong> reduces development complexity</li>
<li><strong>Template marketplace</strong> accelerates deployment</li>
</ul>
<p>If your Jetson project is primarily about AI/LLM applications, Flowise gets you there faster. If you need AI as part of broader system automation, n8n provides more flexibility.</p>
</div><div class="fusion-title title fusion-title-57 fusion-sep-none fusion-title-text fusion-title-size-one"><h1 class="fusion-title-heading title-heading-left" style="margin:0;"><h4 id="installation-methods" data-source-line="70">Installation Methods</h4></h1></div><div class="fusion-text fusion-text-96"><p>You have a few different ways to get Flowise running on your Jetson. Docker installation is probably your best bet since it&#8217;s the most straightforward and handles all the dependencies for you. If you prefer more control, you can do a local Node.js installation. There&#8217;s also the option of cloud deployment if you need external access, though that defeats some of the edge computing benefits.</p>
</div><div class="fusion-title title fusion-title-58 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Docker Installation</h4></h4></div><div class="fusion-text fusion-text-97"><p><em>Note: Ensure your Jetson has Docker installed. For more information, visit <strong>https://docs.docker.com/engine/install/</strong>.</em></p>
</div><div class="fusion-text fusion-text-98"><p><strong>Web UI</strong>: http://localhost:3000</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-46 > .CodeMirror, .fusion-syntax-highlighter-46 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-46 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_46" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_46" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_46" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Pull and run Flowise
docker run -it --rm \
  --name=flowise \
  --network=host \
  -e PORT=3000 \
  -e FLOWISE_USERNAME=jetson \
  -e FLOWISE_PASSWORD=jetson \
  -v /mnt/nvme/cache/flowise:/root/.flowise \
  -e DATABASE_PATH=/root/.flowise \
  -e APIKEY_PATH=/root/.flowise \
  -e SECRETKEY_PATH=/root/.flowise \
  -e LOG_PATH=/root/.flowise/logs \
  -e BLOB_STORAGE_PATH=/root/.flowise/storage \
  --runtime nvidia \
  flowiseai/flowise:latest</textarea></div><div class="fusion-title title fusion-title-59 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Local Installation</h4></h4></div><div class="fusion-text fusion-text-99"><p><em>Note: Ensure Node.js is installed on your Jetson (Node v18.15.0 or v20 is supported). </em></p>
<p><strong>1. Install Node.js on Jetson:</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-47 > .CodeMirror, .fusion-syntax-highlighter-47 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-47 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_47" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_47" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_47" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Install Node.js 20 LTS (recommended)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Verify installation (should be v18.15.0+ or v20+)
node --version
npm --version</textarea></div><div class="fusion-title title fusion-title-60 fusion-sep-none fusion-title-text fusion-title-size-one"><h1 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Quick Start</h4></h1></div><div class="fusion-text fusion-text-100"><p><strong>2. Install Flowise:</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-48 > .CodeMirror, .fusion-syntax-highlighter-48 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-48 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_48" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_48" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_48" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Install Flowise globally
npm install -g flowise

# Or install a specific version
npm install -g flowise@x.x.x</textarea></div><div class="fusion-text fusion-text-101" style="--awb-margin-top:20px;"><p><strong>3. Start Flowise:<br />
</strong></p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-49 > .CodeMirror, .fusion-syntax-highlighter-49 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-49 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_49" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_49" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_49" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">npx flowise start</textarea></div><div class="fusion-text fusion-text-102" style="--awb-margin-top:20px;"><p>Open: <strong>http://localhost:3000</strong></p>
</div><div class="fusion-title title fusion-title-61 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-top:-20px;--awb-margin-bottom:-20px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Getting Started</h4></h4></div><div class="fusion-title title fusion-title-62 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-bottom:-20px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>First Setup</h4></h4></div><div class="fusion-text fusion-text-103"><ol>
<li><strong>Access Flowise</strong>:
<ul>
<li>Open browser to <code>http://your-jetson-ip:3000</code></li>
<li>Login with configured credentials</li>
</ul>
</li>
<li><strong>Configure API Keys</strong>:
<ul>
<li>Go to &#8220;Credentials&#8221; section</li>
<li>Add your LLM provider API keys (OpenAI, Anthropic, etc.)</li>
</ul>
</li>
</ol>
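<p>Before opening the browser, you can run a quick sanity check from another machine on your network (a sketch, assuming the default port 3000; <code>your-jetson-ip</code> is a placeholder for your device&#8217;s address):</p>

```shell
# Reachability check for the Flowise server
# "your-jetson-ip" is a placeholder -- use localhost if checking from the Jetson itself
curl -s -o /dev/null -w "%{http_code}\n" http://your-jetson-ip:3000
# Any HTTP status (e.g. 200, or a redirect to the login page) means Flowise is listening
```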
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-18 hover-type-none"><img decoding="async" width="1" height="1" title="credentials" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/credentials.png" alt class="img-responsive wp-image-1195"/></span></div><div class="fusion-title title fusion-title-63 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Flowise Management Features</h4></h4></div><div class="fusion-text fusion-text-104"><p>Once you&#8217;re inside Flowise, you&#8217;ll find all the management tools neatly organized in the sidebar. The interface is pretty intuitive &#8211; you&#8217;ve got your Chatflows for basic conversational AI, Agentflows for more complex multi-agent setups, and a Marketplace full of templates from the community.</p>
<p>There are separate sections for Tools and utility functions, Assistants for managing your deployed AI instances, and Executions where you can see how all your workflows are performing. The Document Stores section is where you upload files for your knowledge base, while API Keys and Credentials handle all your authentication securely.</p>
<p>What&#8217;s really nice is how everything ties together. You can track execution performance across all your workflows, manage variables globally so you don&#8217;t have to configure the same things over and over, and organize your documents with proper version control. Plus, all your API keys and secrets are encrypted and stored securely.</p>
</div><div class="fusion-title title fusion-title-64 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Building Advanced Agent Workflows</h4></h4></div><div class="fusion-text fusion-text-105"><p>For more complex applications, Flowise also supports advanced agent workflows that can handle multi-step reasoning and autonomous decision-making:</p>
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-19 hover-type-none"><img decoding="async" width="1024" height="494" title="cred" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-1024x494.webp" alt class="img-responsive wp-image-1240" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-200x96.webp 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-300x145.webp 300w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-400x193.webp 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-600x289.webp 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-768x370.webp 768w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-800x386.webp 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred-1024x494.webp 1024w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/cred.webp 1049w" sizes="(max-width: 640px) 100vw, 1024px" /></span></div><div class="fusion-text fusion-text-106" style="--awb-margin-top:20px;"><p><em>Main Flowise dashboard showing sidebar navigation, management options, and credentials setup.</em></p>
</div><div class="fusion-title title fusion-title-65 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Local LLM Setup (Recommended for Jetson)</h4></h4></div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-50 > .CodeMirror, .fusion-syntax-highlighter-50 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-50 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_50" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_50" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_50" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Create Ollama data directory
sudo mkdir -p /mnt/nvme/cache/ollama
sudo chown -R $USER:$USER /mnt/nvme/cache/ollama

# Run Ollama container (pass the NVIDIA runtime so models run on the Jetson GPU)
docker run -d --name=ollama \
  --runtime nvidia \
  -p 11434:11434 \
  -v /mnt/nvme/cache/ollama:/root/.ollama \
  ollama/ollama</textarea></div><div class="fusion-text fusion-text-107" style="--awb-margin-top:20px;"><p>Pull AI models into your separate Ollama container:</p>
<blockquote>
<p>Note: Choose models based on your Jetson&#8217;s RAM &#8211; use 3B models for 4-8GB RAM, 7B-8B models for 16GB+ RAM.</p>
</blockquote>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-51 > .CodeMirror, .fusion-syntax-highlighter-51 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-51 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_51" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_51" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_51" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh"># Pull popular models
docker exec -it ollama ollama pull llama3.2:3b    # Smaller model (3B parameters)
docker exec -it ollama ollama pull llama3.1:8b    # Larger model (8B parameters) 
docker exec -it ollama ollama pull mistral:7b     # Alternative model

# Verify models are downloaded
docker exec -it ollama ollama list

# Test Ollama is responding
curl http://localhost:11434/api/tags</textarea></div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-20 hover-type-none"><img decoding="async" width="806" height="901" title="add ollama" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/add-ollama.png" alt class="img-responsive wp-image-1201" srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/add-ollama-200x224.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/add-ollama-400x447.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/add-ollama-600x671.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/add-ollama-800x894.png 800w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/add-ollama.png 806w" sizes="(max-width: 640px) 100vw, 806px" /></span></div><div class="fusion-title title fusion-title-66 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Template Marketplace</h4></h4></div><div class="fusion-title title fusion-title-67 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Pre-Built Templates for Quick Start</h4></h4></div><div class="fusion-image-element " 
style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-21 hover-type-none"><img decoding="async" width="1" height="1" title="marketplace" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/marketplace.png" alt class="img-responsive wp-image-1204"/></span></div><div class="fusion-text fusion-text-108" style="--awb-margin-top:20px;"><p><strong>Access Marketplace:</strong> Click &#8220;Marketplaces&#8221; in the sidebar and use filters to find relevant templates. Click on the template you want to use and select &#8220;Use Template&#8221;. Modify nodes and connections for your specific needs.</p>
</div><div class="fusion-title title fusion-title-68 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;"><h3>Building Your First Chatbot</h3></h3></div><div class="fusion-text fusion-text-109"><p id="you-can-start-from-template-recommended"><strong>You can start from Template (Recommended)</strong></p>
<ol>
<li><strong>Navigate to Marketplace</strong>: Go to &#8220;Marketplaces&#8221; → &#8220;Community Templates&#8221;</li>
<li><strong>Select Template</strong>: Choose &#8220;Basic&#8221; or &#8220;Customer Support&#8221; template</li>
<li><strong>Import and Customize</strong>: Click template → modify for your needs</li>
<li><strong>Configure Ollama</strong>: Update Chat Model nodes to use <code>http://localhost:11434</code></li>
</ol>
<p id="or-you-can-build-from-scratch"><strong>Or you can build from scratch</strong></p>
<h4 id="step-1-create-new-chatflow">Step 1: Create New Chatflow</h4>
<ol>
<li>Go to &#8220;Chatflows&#8221; tab</li>
<li>Click &#8220;Add New&#8221;</li>
<li>Name your chatflow</li>
</ol>
<p><strong>And you can add nodes from Categories</strong></p>
<p>When you&#8217;re building workflows, you&#8217;ll see different node categories organized logically. LangChain nodes give you the core components and chains, while LlamaIndex nodes are specifically for RAG applications. There are Utility nodes for helper functions, Agent nodes for autonomous components, and Chat Model nodes that connect to different LLMs like Ollama or OpenAI. You&#8217;ll also find Document Loaders for processing files and Embedding nodes for vector operations.</p>
</div><div class="fusion-image-element " style="--awb-aspect-ratio:4 / 3;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-22 hover-type-none has-aspect-ratio"><img decoding="async" width="755" height="1024" title="addnodesinchatflow" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/addnodesinchatflow-755x1024.png" class="img-responsive wp-image-1202 img-with-aspect-ratio" data-parent-fit="cover" data-parent-container=".fusion-image-element" alt srcset="https://blog-en.openzeka.com/wp-content/uploads/2025/08/addnodesinchatflow-200x271.png 200w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/addnodesinchatflow-400x543.png 400w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/addnodesinchatflow-600x814.png 600w, https://blog-en.openzeka.com/wp-content/uploads/2025/08/addnodesinchatflow.png 762w" sizes="(max-width: 640px) 100vw, 755px" /></span></div><div class="fusion-title title fusion-title-69 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Configure Components</h4></h4></div><div class="fusion-text fusion-text-110"><p>When configuring your Chat Ollama node, set the model name to something like llama3.2:3b, keep the temperature around 0.7 for good creativity, and point the Base URL to <strong><em>http://localhost:11434</em></strong> where your Ollama container is running.</p>
<p>For the Conversation Chain, you&#8217;ll connect your Chat Model input and add a system prompt to give your AI some context about what it should do. Here&#8217;s a good starting prompt:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-52 > .CodeMirror, .fusion-syntax-highlighter-52 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-52 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_52" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_52" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_52" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next">You are a helpful AI assistant running on an NVIDIA Jetson device. 
Help with AI, robotics, and edge computing questions.
Keep responses concise and practical.</textarea></div><div class="fusion-title title fusion-title-70 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Test and Deploy</h4></h4></div><div class="fusion-text fusion-text-111"><p>Once you&#8217;ve got everything configured, hit the save button and use the chat panel to test your bot. Make sure it&#8217;s responding the way you want, then you can use Flowise&#8217;s deployment options to get it ready for production use.</p>
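<p>You can also exercise a saved chatflow from the command line through Flowise&#8217;s Prediction API (a sketch, assuming the default port 3000; the chatflow ID below is a placeholder you copy from your own flow):</p>

```shell
# Query a chatflow over HTTP instead of the web chat panel
# <chatflow-id> is a placeholder -- copy the real ID from your chatflow's URL or API dialog
curl -X POST http://localhost:3000/api/v1/prediction/<chatflow-id> \
  -H "Content-Type: application/json" \
  -d '{"question": "What is edge computing?"}'
```

<p>The response comes back as JSON containing the model&#8217;s answer, which makes this handy for wiring the bot into scripts or other services.</p>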
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-23 hover-type-none"><img decoding="async" title="olllama_wokrflow" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/olllama_wokrflow.avif" alt class="img-responsive wp-image-1192"/></span></div><div class="fusion-title title fusion-title-71 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Deployment</h4></h4></div><div class="fusion-text fusion-text-112"><p>There are 5 options for deployment (<strong>&lt;/&gt; button</strong>). Flowise makes it really easy to embed your chatbots into websites. You can choose from a popup widget that floats on your page, a fullpage dedicated chat interface, or if you&#8217;re using React, there are specific React components for both popup and fullpage implementations.</p>
<p>The embedding code is pretty straightforward &#8211; just import the Flowise embed script and initialize it with your chatflow ID:</p>
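<p>The snippet the <strong>&lt;/&gt;</strong> dialog generates looks roughly like this (the chatflow ID and host below are placeholders); here it is written into a minimal test page so you can try it locally:</p>

```shell
# Write a minimal HTML page that embeds the Flowise popup chatbot
# "your-chatflow-id" and "your-jetson-ip" are placeholders -- substitute your own values
cat > chatbot-test.html <<'EOF'
<script type="module">
  import Chatbot from "https://cdn.jsdelivr.net/npm/flowise-embed/dist/web.js"
  Chatbot.init({
    chatflowid: "your-chatflow-id",
    apiHost: "http://your-jetson-ip:3000",
  })
</script>
EOF
```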
</div><div class="fusion-image-element " style="--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);"><span class=" fusion-imageframe imageframe-none imageframe-24 hover-type-none"><img decoding="async" width="1" height="1" title="embedchatfloworagentflow" src="https://blog-en.openzeka.com/wp-content/uploads/2025/08/embedchatfloworagentflow.png" alt class="img-responsive wp-image-1221"/></span></div><div class="fusion-text fusion-text-113" style="--awb-margin-top:20px;"><p>You also get some nice advanced features like direct public links for sharing, custom authentication if you need access controls, theme customization to match your brand, and custom JavaScript event handling for more complex integrations.</p>
</div><div class="fusion-title title fusion-title-72 fusion-sep-none fusion-title-text fusion-title-size-three"><h3 class="fusion-title-heading title-heading-left" style="margin:0;"><h3>Troubleshooting</h3></h3></div><div class="fusion-text fusion-text-114"><h3 id="common-issues-you-might-run-into">Common Issues You Might Run Into</h3>
<p><strong>Node.js Version Problems:</strong> If you&#8217;re getting errors about Node.js being too old, you&#8217;ll need to update it. Run these commands to get a newer version:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-53 > .CodeMirror, .fusion-syntax-highlighter-53 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-53 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_53" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_53" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_53" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs</textarea></div><div class="fusion-text fusion-text-115" style="--awb-margin-top:10px;"><p><strong>Memory Issues:</strong> Running out of memory with &#8220;JavaScript heap out of memory&#8221; errors? Increase the memory limit before starting Flowise:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-54 > .CodeMirror, .fusion-syntax-highlighter-54 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-54 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_54" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_54" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_54" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">export NODE_OPTIONS="--max-old-space-size=4096"
flowise start</textarea></div><div class="fusion-text fusion-text-116" style="--awb-margin-top:10px;"><p><strong>GPU Not Working:</strong> If your GPU isn&#8217;t being detected, check GPU activity with <code>tegrastats</code>, verify that the NVIDIA runtime is configured in <code>/etc/docker/daemon.json</code>, and restart the Docker service with <code>sudo systemctl restart docker</code>.</p>
<p><strong>Port Conflicts:</strong> If port 3000 is already in use, find what&#8217;s using it with <code>sudo lsof -i :3000</code>, kill that process with <code>sudo kill -9 &lt;PID&gt;</code>, or just start Flowise on a different port: <code>npx flowise start --PORT=3001</code></p>
<p><strong>Permission Problems:</strong> File permission issues? Fix them with:</p>
</div><style type="text/css" scopped="scopped">.fusion-syntax-highlighter-55 > .CodeMirror, .fusion-syntax-highlighter-55 > .CodeMirror .CodeMirror-gutters {background-color:#2d3748;}</style><div class="fusion-syntax-highlighter-container fusion-syntax-highlighter-55 fusion-syntax-highlighter-theme-dark" style="opacity:0;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;font-size:14px;border-width:1px;border-style:solid;border-color:rgba(242,243,245,0);"><div class="syntax-highlighter-copy-code"><span class="syntax-highlighter-copy-code-title" data-id="fusion_syntax_highlighter_55" style="font-size:14px;">Copy to Clipboard</span></div><label for="fusion_syntax_highlighter_55" class="screen-reader-text">Syntax Highlighter</label><textarea class="fusion-syntax-highlighter-textarea" id="fusion_syntax_highlighter_55" data-readOnly="nocursor" data-lineNumbers="" data-lineWrapping="" data-theme="oceanic-next" data-mode="text/x-sh">sudo chown -R $USER:$USER ~/.flowise
chmod -R 755 ~/.flowise</textarea></div><div class="fusion-title title fusion-title-73 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Docker Issues</h4></h4></div><div class="fusion-text fusion-text-117"><p>If you&#8217;re running Flowise in Docker and having problems, here are some quick fixes:</p>
<p>Check what&#8217;s happening with <code>docker logs flowise</code>, restart the container with <code>docker restart flowise</code>, or see all container statuses with <code>docker ps -a</code>. If things are really broken, you can remove the container completely with <code>docker rm -f flowise</code> and recreate it with your original docker run command.</p>
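<p>Put together, a minimal recovery sequence might look like this (the image name and port here are assumptions based on the standard <code>flowiseai/flowise</code> image; substitute the flags from your original <code>docker run</code> command):</p>

```shell
# Inspect, remove, and recreate the Flowise container
docker logs flowise                  # check recent output for errors
docker rm -f flowise                 # force-remove the broken container
docker run -d --name flowise \
  -p 3000:3000 flowiseai/flowise     # recreate with your original options
```
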
</div><div class="fusion-title title fusion-title-74 fusion-sep-none fusion-title-text fusion-title-size-four"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4 id="performance-problems">Performance Problems</h4></h4></div><div class="fusion-text fusion-text-118"><p><strong>Slow responses?</strong> Try using local models instead of API calls, reduce the context window size, or optimize your RAG chunk sizes if you&#8217;re doing document processing.</p>
<p><strong>Memory usage too high?</strong> Reduce the buffer window memory size, switch to smaller models, or clear your browser cache, which sometimes helps.</p>
</div><div class="fusion-title title fusion-title-75 fusion-sep-none fusion-title-text fusion-title-size-four" style="--awb-margin-top:-10px;"><h4 class="fusion-title-heading title-heading-left" style="margin:0;"><h4>Getting Help</h4></h4></div><div class="fusion-text fusion-text-119"><p>If you&#8217;re stuck, the Flowise community is pretty active. Check out the <a style="color: #38c92e;" href="https://github.com/FlowiseAI/Flowise/issues">GitHub Issues</a> for bug reports and feature requests. The <a style="color: #38c92e;" href="https://docs.flowiseai.com/">official documentation</a> is also quite comprehensive. For Jetson-specific issues, the <a style="color: #38c92e;" href="https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/70">NVIDIA Developer Forums</a> are your best bet.</p>
</div></div></div></div></div>
<p>The post <a href="https://blog-en.openzeka.com/jetson-generative-ai-flowise/">Jetson Generative AI &#8211; Flowise</a> appeared first on <a href="https://blog-en.openzeka.com">OpenZeka EN Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
