AI Cortexo Blog

Latest from Medium

Medium

Automating Browser Tasks with LangChain, Ollama, and Async Python

By Muhammad Ishaq · 2 min read · May 6, 2025 · Medium

How I combined local LLMs, LangChain, and browser automation to build an AI agent that contextually navigates and extracts web data using Python.

Watch Demo Video on LinkedIn

In a recent side project, I experimented with combining local LLMs using Ollama, LangChain, ChatOllama, and a browser automation agent to perform intelligent tasks on the internet — all through Python.

The goal? Have an AI agent navigate to DuckDuckGo, search for "quantum computing in 2025", and extract the search results. Not just as a static scraper, but one that understands the task contextually, powered by an LLM.

🔧 The Setup

Ollama for running a local LLM (I used llama2:7b)
LangChain to abstract the LLM chat interface
A browser agent (the Agent class from the browser_use package) for navigation and control
Python's asyncio for running everything asynchronously

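import asyncio

from browser_use import Agent
from langchain_ollama import ChatOllama


async def run_search():
    # The task is written in plain English; the agent decides how to carry it out.
    agent = Agent(
        task=(
            "Go to https://duckduckgo.com/, "
            "and search for quantum computing in 2025 and capture the results"
        ),
        llm=ChatOllama(
            model='llama2:7b',
            # Requested context window. Llama 2 was trained with a 4,096-token
            # context, so a smaller value may be more appropriate.
            num_ctx=128000,
        ),
        max_actions_per_step=1,
        tool_call_in_content=False,
    )
    await agent.run()


# A Jupyter notebook already runs an event loop, so patch it with
# nest_asyncio and use top-level await.
import nest_asyncio
nest_asyncio.apply()
await run_search()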
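The top-level await above only works where an event loop is already running, such as a Jupyter notebook. In a standalone script, a minimal sketch would use asyncio.run instead. The coroutine below is a stand-in that does no browsing; run_search_stub is a hypothetical name, not part of browser_use.

```python
import asyncio


async def run_search_stub() -> str:
    """Stand-in for run_search(); a real version would await agent.run()."""
    await asyncio.sleep(0)  # yield to the event loop once
    return "done"


def main() -> str:
    # asyncio.run creates an event loop, runs the coroutine to completion,
    # and closes the loop. No nest_asyncio is needed in a plain script.
    return asyncio.run(run_search_stub())


if __name__ == "__main__":
    print(main())
```

Swapping run_search_stub for the real run_search coroutine should be the only change needed in a script.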

🤖 What the Agent Actually Does

This is not a simple scraper that hardcodes HTML tags. The agent uses an LLM to reason about the task, navigate a browser (likely headless), and interact with elements based on their meaning.

For example, when instructed to "search for quantum computing in 2025", it understands what "search" means and knows how to interact with DuckDuckGo's UI to get it done.

The power here is that the code doesn't need to know how the page is structured. It just delegates that reasoning to the model.
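To make that delegation concrete, here is a toy observe-decide-act loop. It is not how browser_use is implemented: fake_model is a hard-coded stand-in for the LLM and the page is a plain dictionary, both assumptions made only for illustration.

```python
Page = dict    # e.g. {"url": str, "elements": list of str}
Action = dict  # e.g. {"type": "type", "text": "..."}


def fake_model(task: str, page: Page) -> Action:
    """Hypothetical stand-in for the LLM: picks the next action from the page state."""
    if "search box" in page["elements"] and not page.get("query"):
        return {"type": "type", "target": "search box", "text": task}
    if page.get("query") and not page.get("results"):
        return {"type": "submit"}
    return {"type": "done"}


def apply_action(page: Page, action: Action) -> Page:
    """Simulated browser: returns an updated copy of the page."""
    page = dict(page)
    if action["type"] == "type":
        page["query"] = action["text"]
    elif action["type"] == "submit":
        page["results"] = [f"result for {page['query']}"]
    return page


def run_agent(task: str, page: Page, model=fake_model, max_steps: int = 5):
    """Ask the model for one action per step until it reports 'done'."""
    history = []
    for _ in range(max_steps):
        action = model(task, page)
        history.append(action["type"])
        if action["type"] == "done":
            break
        page = apply_action(page, action)
    return page, history
```

The calling code never mentions selectors or HTML tags; only the model function looks at the page, which is the design point the paragraph above describes.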

🧠 Why This Matters

We're moving towards a future where automating browser tasks won't require brittle scripts or hardcoded element locators. With tools like LangChain and Ollama, intelligent agents can interpret goals and act accordingly — bridging the gap between static scripts and true task understanding.

This has applications beyond search engines — think product scraping, data monitoring, even form submissions or customer support automation.

🏁 Final Thoughts

If you're experimenting with local LLMs, LangChain's ChatOllama is a great way to bring that power into practical workflows. Combined with a browser automation agent, the possibilities open up dramatically.

This is just scratching the surface. Next up, I'm planning to add a feedback loop so the agent can improve its interactions over time — or maybe explore how this integrates with voice input and TTS.

Stay tuned!

Medium

Deploying DeepSeek‑R1 on Hugging Face Spaces

By Muhammad Ishaq · 2 min read · Apr 19, 2025 · Medium

How to turn DeepSeek‑R1 into a live, web‑accessible service by deploying it on Hugging Face Spaces.

Overview

DeepSeek-R1 is an open-weight reasoning model. In this post we put it to work on code analysis and bug resolution, and walk through turning it into a live, web-accessible service by deploying it on Hugging Face Spaces.

1. Pick Your Inference Strategy

A) Hosted Inference API (recommended for free CPU Spaces)

Pros: No model‑loading in your container, instant scaling, works on the free tier

Cons: Requires an HF Inference API token, latency depends on network

from huggingface_hub import InferenceClient

# The model ID must use plain ASCII hyphens.
client = InferenceClient("deepseek-ai/DeepSeek-R1")

def deepseek_review(code_snippet: str) -> str:
    prompt = f"Analyze this code and summarize issues in 30 words:\n\n{code_snippet}"
    # With the default details=False, text_generation returns the generated text as a str.
    return client.text_generation(prompt=prompt, max_new_tokens=200)

B) Build Your Space's App

from flask import Flask, request, jsonify
from huggingface_hub import InferenceClient

app = Flask(__name__)
client = InferenceClient("deepseek-ai/DeepSeek-R1")

@app.route("/review", methods=["POST"])
def review():
    # silent=True yields None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    code = data.get("code_snippet", "")
    if not code:
        return jsonify({"error": "No code provided"}), 400

    # text_generation returns the generated text as a plain string
    analysis = client.text_generation(
        prompt=f"Review this code in 30 words:\n\n{code}",
        max_new_tokens=200
    )
    return jsonify({"deepseek_analysis": analysis})

if __name__ == "__main__":
    # Port 7860 matches the port exposed in the Dockerfile
    app.run(host="0.0.0.0", port=7860)

C) Dockerize for Spaces

FROM python:3.10-slim

WORKDIR /app
COPY . /app

RUN pip install --upgrade pip \
 && pip install flask huggingface_hub

EXPOSE 7860
CMD ["python", "app.py"]

D) Publish and Test

Push your repo to GitHub (or directly to Hugging Face).
Create a new Space on Hugging Face, select Docker runtime, link your repo.
Deploy: Spaces will build your Dockerfile and launch your API.
Test with curl or in the browser:

curl -X POST https://<your-space>.hf.space/review \
  -H "Content-Type: application/json" \
  -d '{"code_snippet":"def add(a,b):return a+b"}'
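The same request can be sent from Python using only the standard library. This is a sketch: the Space URL is a placeholder you would replace with your own, and build_review_request and send_review are hypothetical helper names.

```python
import json
import urllib.request


def build_review_request(base_url: str, code_snippet: str) -> urllib.request.Request:
    """Build a POST request for the /review endpoint without sending it."""
    body = json.dumps({"code_snippet": code_snippet}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/review",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_review(base_url: str, code_snippet: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response."""
    req = build_review_request(base_url, code_snippet)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example call (placeholder URL, requires a deployed Space):
# send_review("https://your-space.hf.space", "def add(a,b):return a+b")
```

Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.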

Limitations & Next Steps

Free Spaces: CPU only, network‑based inference; rate‑limited
Paid GPU Spaces: Local model loading possible but requires a beefy GPU.
Extending: Add a Gradio frontend or integrate with VSCode for instant code‑to‑AI feedback.

Conclusion

Deploying DeepSeek‑R1 on Hugging Face Spaces transforms your code‑review workflow into an on‑demand AI service. Whether you use the hosted API for quick testing or spin up a GPU‑enabled Space for full local inference, you now have a blueprint to deliver deep code insights in minutes.

Happy deploying!

Medium

🚀 Building an AI-Powered Code Reviewer: Flask + DeepSeek + LangChain

By Muhammad Ishaq · 6 min read · Apr 18, 2025 · Medium

How to build a lightweight AI-powered code review system using LangChain, DeepSeek, and Flask.

Code reviews are a cornerstone of software quality, but what if you could supercharge your review process with AI?

In this post, I'll walk you through building a lightweight AI-powered code review system using:

LangChain for prompting
DeepSeek (or any large-code-model) for deep analysis
Flask for exposing it as an API

💻 The Code Setup:

Before importing the DeepSeek R1 model, its weights need to be downloaded to your computer so the model can be loaded locally. To do that, first install Ollama on your local machine.

You can find the download link here:

https://ollama.com/download/windows

Step 1: Install Ollama

Go to the official Ollama website and download the installer for your platform (Windows in my case).
Run the installer. It sets up everything, including GPU support (if compatible drivers are installed).

Step 2: Open Terminal / Command Prompt

Once installed, open Command Prompt or Windows Terminal.

Step 3: Pull & Run the DeepSeek R1 Model

Use the following command to download and start the model:

ollama run deepseek-r1:1.5b

This command will:
Pull the DeepSeek R1 (1.5B) model
Set up a local instance
Start an interactive chat

You'll now see a blinking prompt where you can begin chatting with the model.

Now we need to load the deepseek-r1 model in our Python Jupyter notebook. First import the required dependencies (this assumes the langchain and langchain-ollama packages are installed):

from langchain.chains import SequentialChain, LLMChain
from langchain.prompts import PromptTemplate
from langchain_ollama.llms import OllamaLLM

Then create the model instance:

llm = OllamaLLM(model="deepseek-r1:1.5b")

Next, we build two prompts: one for a light review and another for deep inspection.

from langchain.prompts import PromptTemplate

code_review_prompt = PromptTemplate(
    input_variables=["code_snippet"],
    template="""
    You are an AI system designed for code review.
    - Analyze the code for bugs, performance issues, and logical errors.
    - Flag potential risks and suggest if DeepSeek review is necessary.

    Code to Review:
    {code_snippet}

    Review Result:
    """
)
deepseek_prompt = PromptTemplate(
    input_variables=["flagged_code", "issues_summary"],
    template="""
    You are DeepSeek, an advanced AI specialized in deep code analysis.
    Code: {flagged_code}

    Issues Summary:
    {issues_summary}

    Tasks:
    1. Analyze the issues.
    2. Suggest corrected and optimized code.
    3. Explain your reasoning.

    DeepSeek Analysis:
    """
)

Step 4: Create chain using langchain

Define an LLM chain for each agent. In our case there are two agents:

code_review_chain = LLMChain(llm=llm, prompt=code_review_prompt, output_key="issues_summary")
deepseek_chain = LLMChain(llm=llm, prompt=deepseek_prompt, output_key="deepseek_analysis")
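To see how the two chains fit together, here is a dependency-free sketch of the data flow. A stub function stands in for the LLM, and make_chain is a hypothetical helper, not a LangChain API. The point is only that the first chain's output_key ("issues_summary") becomes an input variable of the second prompt.

```python
from typing import Callable, Dict


def make_chain(llm: Callable[[str], str], template: str, output_key: str):
    """Return a function that fills the template, calls the LLM, and stores the reply under output_key."""
    def run(inputs: Dict[str, str]) -> Dict[str, str]:
        prompt = template.format(**inputs)
        return {**inputs, output_key: llm(prompt)}
    return run


def stub_llm(prompt: str) -> str:
    """Stand-in for the real model: reports the prompt length instead of generating text."""
    return f"<reply to {len(prompt)} chars>"


review = make_chain(stub_llm, "Review this code:\n{code_snippet}", "issues_summary")
deep = make_chain(
    stub_llm,
    "Code: {flagged_code}\nIssues: {issues_summary}",
    "deepseek_analysis",
)

state = review({"code_snippet": "def add(a, b): return a + b"})
# Feed the first chain's output into the second chain's input variables.
state = deep({
    "flagged_code": state["code_snippet"],
    "issues_summary": state["issues_summary"],
})
```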

Step 5: Test the chain with code snippet or any bug

Here we pass a correct code snippet. You can instead pass faulty code, in any programming language, that contains syntax or logical errors.

code_snippet="""
def add_numbers(a, b):
    # This function adds two numbers
    return a + b
"""
# Run the light review first; its output feeds the deep analysis.
review_result = code_review_chain.run({"code_snippet": code_snippet})
deepseek_result = deepseek_chain.run({
    "flagged_code": code_snippet,
    "issues_summary": review_result
})
print("=== Code Review Summary ===\n", review_result)

=== Code Review Summary ===
 <think>
Okay, so I'm trying to figure out how to analyze this Python code. The function is called add_numbers and it takes two parameters, a and b. Inside the function, there's a comment saying "This function adds two numbers" but nothing else. Then it returns the sum of a and b.

Hmm, that seems pretty straightforward. But I remember that in programming, especially with other languages like JavaScript or C++, comments aren't just for documentation; they can also be errors if they're not well-formatted. Let me think about the syntax here. The comment is right after the function definition but before the return statement. That's correct structure-wise because each line needs to start with a number or keyword and end with a semicolon unless it's part of an expression.

Wait, in Python, do you need a semicolon after the colon? I'm pretty sure that's the case. So maybe there should be a semicolon after the function definition. But looking at the code, there isn't one. That could be an oversight because it's missing syntax errors that might cause issues, especially when debugging.

Also, in Python, you can have multiple return statements with each subsequent line following on without another semicolon, but having a semicolon usually makes it clearer and less ambiguous for others to read. So the absence of a semicolon could lead to confusion or potential bugs where the function isn't correctly defined.

Another thing to consider is the variable names. The function takes a and b as parameters, which are probably integers since it's adding them. But if they were different data types, like strings, that might cause issues elsewhere. However, without seeing how this function is called, I can't tell if that's an issue here.

I also notice that the code could be improved by following standard Python conventions for readability. Maybe formatting each line correctly and using consistent indentation would make it easier to read and maintain in the future. The comments are good but not every day.

So putting this all together, the function is correct in functionality, but there are some syntax issues like missing semicolons after the colon and possibly an inconsistent structure that could confuse others. These could lead to bugs if someone reads through the code without looking closely.
</think>

The provided Python function `add_numbers(a, b)` appears correct in terms of functionality. However, it contains a few syntactical inconsistencies that may affect readability:

1. **Missing Semicolon After Function Definition**: The line after the colon lacks a semicolon, which is not standard in Python and could lead to parsing errors.

2. **Inconsistent Indentation**: There's inconsistent formatting with some lines having proper indentation and others being less formatted, which might make it harder for others to read and maintain.

3. **Function Name vs. Documentation Comment**: The comment "This function adds two numbers" is present but not a strong recommendation for readability over the straightforward implementation.
...
    return a + b
```
print("\n=== DeepSeek Analysis ===\n", deepseek_result)

=== DeepSeek Analysis ===
 <think>
Alright, so I'm trying to understand why this function was flagged as having potential issues with an AI system. Let me walk through it step by step.

First, looking at the code:

def add_numbers(a, b):
    # This function adds two numbers
    return a + b

This looks simple enough. The function takes two parameters and returns their sum. But maybe there are some things I'm missing about how this works in terms of Python syntax or best practices.

Okay, starting with the comment. It says "This function adds two numbers." While that's a good suggestion, it's not a strong recommendation for readability. Comments can be confusing if they're not placed correctly or if they change too often. In this case, the comment might be redundant because the function is straightforward, but it's still acceptable as it explains what the function does.

Next, there's a line after the comment with just "This function adds two numbers" without any semicolon. I think in Python, you need to use a semicolon only if it's part of an expression and not for comments unless it's the last statement in the code block. So maybe this is causing some confusion or parsing issues. It would be better to follow standard syntax where each line ends with a semicolon.

Then, there are other lines after the function definition without proper indentation. For example:

    # This comment again
    return a + b

These extra lines could make it hard for others to read and understand the code. Proper indentation makes the structure clear, so I'd suggest adding consistent indentation here.

Lastly, perhaps the documentation is redundant because the function itself is self-explanatory. It's better to keep comments minimal when possible. Maybe just a brief note about what each parameter is would be sufficient without cluttering the code with unnecessary text.
...
2. Ensure proper indentation for each line.
3. Remove the extra comments and focus on the code's functionality.

These changes would make the function easier to understand without compromising its clarity.
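Both outputs above begin with the model's reasoning wrapped in <think> tags. If only the final answer should reach the caller, the two parts can be separated. A minimal sketch, assuming the output contains at most one <think>...</think> block; split_reasoning is a hypothetical helper name.

```python
import re
from typing import Tuple

_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is found."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

Keeping the reasoning rather than discarding it can still be useful for debugging prompts.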

Step 6: Wrap the python code in Flask service

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/review', methods=['POST'])
def review_code():
    # silent=True returns None instead of raising when the body is not valid JSON
    data = request.get_json(silent=True) or {}
    code_snippet = data.get('code_snippet')

    if not code_snippet:
        return jsonify({"error": "No code snippet provided"}), 400

    review_result = code_review_chain.run({"code_snippet": code_snippet})

    if "flagged" in review_result.lower():
        deepseek_result = deepseek_chain.run({
            "flagged_code": code_snippet,
            "issues_summary": review_result
        })
        return jsonify({
            "review_result": review_result,
            "deepseek_analysis": deepseek_result
        }), 200

    return jsonify({
        "review_result": review_result,
        "deepseek_analysis": None
    }), 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
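The routing decision in the handler can be exercised without Flask or Ollama by passing the two chains in as plain callables. This is a sketch with stub functions; build_response and the stubs are hypothetical names. Note that the substring check is only a heuristic: a review that says "nothing flagged" would still trigger the deep analysis.

```python
from typing import Callable, Dict, Optional


def build_response(
    code_snippet: str,
    light_review: Callable[[str], str],
    deep_review: Callable[[str, str], str],
) -> Dict[str, Optional[str]]:
    """Mirror the handler: always run the light review, run the deep one only when flagged."""
    review_result = light_review(code_snippet)
    deep_result = None
    if "flagged" in review_result.lower():
        deep_result = deep_review(code_snippet, review_result)
    return {"review_result": review_result, "deepseek_analysis": deep_result}


# Stub reviewers standing in for code_review_chain and deepseek_chain.
def stub_light(code: str) -> str:
    return "Flagged: possible division by zero" if "/" in code else "Looks fine"


def stub_deep(code: str, issues: str) -> str:
    return f"Deep analysis of: {issues}"


clean = build_response("a + b", stub_light, stub_deep)
risky = build_response("a / b", stub_light, stub_deep)
```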

Conclusion

This is a simple but powerful pattern: LLMs as AI code reviewers. Paired with good prompt design and production-grade APIs, it can change the way teams handle code quality, especially in distributed and asynchronous workflows.


Medium

Running DeepSeek R1 (1.5B) Locally with Ollama on an RTX 3050

By Muhammad Ishaq · 3 min read · Apr 11, 2025 · Medium

How to run DeepSeek R1 (1.5B) locally on your NVIDIA RTX 3050 laptop GPU using Ollama for fast, private LLM inference.

Recently, I tested out the DeepSeek R1 1.5B model using Ollama — a simple and efficient tool for running LLMs locally. I was impressed by how smoothly it ran on my NVIDIA GeForce RTX 3050 laptop GPU, especially for a model that supports reasoning, code generation, and multi-step agentic workflows.

In this post, I'll walk you through how to get it up and running on your own machine.

Why DeepSeek R1?

The DeepSeek R1 series focuses on high-performance, open-source LLMs with strong reasoning and programming capabilities. The 1.5B parameter model is lightweight enough to run locally while still delivering solid results for tasks like:

Code generation
Mathematical reasoning
Instruction following
Multi-turn workflows

It's a great option if you're interested in experimenting with local AI agents or building lightweight AI tools without relying on cloud APIs.

My Setup

GPU: NVIDIA GeForce RTX 3050 (6GB VRAM)
Driver Version: 566.49
CUDA: 12.7
OS: Windows 11
Ollama: Latest (as of April 2025)

How to Run DeepSeek R1 (1.5B) Locally with Ollama

Step 1: Install Ollama

Go to the official Ollama website and download the installer for your platform (Windows in my case).
Run the installer. It sets up everything, including GPU support (if compatible drivers are installed).

Step 2: Open Terminal / Command Prompt

Once installed, open Command Prompt or Windows Terminal.

Step 3: Pull & Run the DeepSeek R1 Model

Use the following command to download and start the model:

ollama run deepseek-r1:1.5b

This command will:
Pull the DeepSeek R1 (1.5B) model
Set up a local instance
Start an interactive chat

You'll now see a blinking prompt where you can begin chatting with the model.

Step 4: Check GPU Compatibility

Use the following command to check the specs of your local GPU:

nvidia-smi

Specs of my GPU

See output of nvidia-smi for details
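The same information can be read from Python by asking nvidia-smi for CSV output. A sketch, assuming nvidia-smi is on the PATH; parse_gpu_line and query_gpus are hypothetical helper names.

```python
import shutil
import subprocess
from typing import Dict, List

QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_line(line: str) -> Dict[str, str]:
    """Turn one CSV line from nvidia-smi into a field-name -> value mapping."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def query_gpus() -> List[Dict[str, str]]:
    """Return one dict per GPU, or an empty list if nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(QUERY_FIELDS)}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

For a made-up line such as "Example GPU, 000.00, 6144 MiB", parse_gpu_line returns a dictionary keyed by name, driver_version, and memory.total.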

Step 5: Run Chatbox and Ollama

This screenshot captures the Settings panel of the Chatbox interface, showing integration with the Ollama API to run a local language model. The API Host is set to http://127.0.0.1:11434, confirming that the model is being served locally on the machine. The selected model is deepseek-r1:1.5b, a 1.5 billion parameter version of the DeepSeek R1 model, known for its efficient performance on lightweight hardware.

Under the Advanced section:

Max Message Count in Context is set to 20, which controls how many past messages are retained for context in the conversation.
Temperature is set to 0.7, offering a good balance between creativity and reliability in the model's responses.

This configuration shows a locally hosted AI chat assistant running through Ollama, giving the user full control over performance, privacy, and customization.
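The same settings can be applied when calling the local Ollama server directly. The sketch below builds a request body for Ollama's /api/chat endpoint using the host, model, temperature, and message limit described above. build_chat_payload and chat are hypothetical helper names, and trimming to the last 20 messages is an assumption about what the Chatbox setting does.

```python
import json
import urllib.request
from typing import Dict, List

OLLAMA_HOST = "http://127.0.0.1:11434"
MODEL = "deepseek-r1:1.5b"
MAX_CONTEXT_MESSAGES = 20
TEMPERATURE = 0.7


def build_chat_payload(history: List[Dict[str, str]]) -> dict:
    """Keep only the most recent messages and attach the sampling options."""
    return {
        "model": MODEL,
        "messages": history[-MAX_CONTEXT_MESSAGES:],
        "stream": False,  # ask for a single JSON response instead of a stream
        "options": {"temperature": TEMPERATURE},
    }


def chat(history: List[Dict[str, str]]) -> str:
    """POST the payload to the local server and return the assistant's reply text."""
    req = urllib.request.Request(
        OLLAMA_HOST + "/api/chat",
        data=json.dumps(build_chat_payload(history)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["message"]["content"]
```

Calling chat requires the Ollama server to be running with the model already pulled.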

Ollama running on Chatbox with real-time GPU inference

Running DeepSeek R1 (1.5B) locally with Ollama is a powerful way to leverage AI models directly on your machine, offering quick inference and hands-on experimentation without relying on cloud APIs. Whether you're working on code generation, reasoning tasks, or more complex AI workflows, this setup allows you to explore the potential of modern language models without the need for constant internet access.

With the capability to run smoothly even on mid-range GPUs like the NVIDIA GeForce RTX 3050, it's a great option for developers, researchers, and enthusiasts who want to build and test AI models locally.

Give it a try and see how it can enhance your AI projects!

I have also posted a video of this test on my LinkedIn profile:

Watch the LinkedIn Video

Medium

Multi AI Agent Systems with N8N

By Muhammad Ishaq · Medium

How to leverage the n8n framework to automate business processes using multi-AI agent systems and no-code/low-code strategies.

Hey everyone! 👋

Today, I'm excited to talk about multi-AI agent systems and how we can leverage the n8n framework to automate business processes using no-code and low-code strategies.

What Are AI Agent Systems?

Before diving in, let's first understand what an AI Agent system actually is. Simply put, an AI Agent is a system designed to replace human intervention, autonomously handling tasks with minimal to no manual input. Sounds fascinating, right? 🚀

The Rise of AI in Automation

With AI evolving at an unprecedented pace, businesses worldwide are integrating AI-driven solutions to streamline operations and enhance efficiency. Companies are rapidly transitioning to AI-powered automation, optimizing workflows, and reducing manual effort.

Why n8n?

Now, let's talk about n8n — a powerful automation framework that requires little to no coding experience. It offers built-in connectors, making it incredibly easy to develop cutting-edge automated systems without needing deep technical expertise.

The Multi-AI Agent System

I've developed a multi-AI agent system that enables seamless integration with private company data. With this system, businesses can interact with documents and policies effortlessly, leveraging AI for smart, automated decision-making. Pretty cool, right? 😎

AI-driven automation is the future, and frameworks like n8n make it accessible to everyone. Let's explore how we can harness its potential to redefine business efficiency! 💡✨

Multi-AI Agent System Architecture Diagram

In this tutorial, we will walk through the steps of building an AI chatbot using OpenAI's Chat Model, Memory, and Pinecone Vector Store for enhanced retrieval capabilities. The architecture is designed to process user messages, store conversation history, and retrieve relevant context for improved responses.

Prerequisites

An OpenAI API key
A Pinecone account and API key
A development environment (Python, Node.js, or any other backend framework)

Building an AI Chatbot with OpenAI and Pinecone

Introduction

Step 1: Understanding the Architecture

Chat Message Received: The bot receives a user's message as input.
AI Agent: The core component that processes the message and determines the response strategy.
OpenAI Chat Model: Generates AI responses based on the conversation history.
Memory: Stores past conversations to maintain context.
Pinecone Vector Store: Stores and retrieves embeddings for efficient similarity search.
Embeddings OpenAI: Converts text into vector embeddings for better retrieval and understanding.

Step 2: Setting Up OpenAI Chat Model

Sign up for OpenAI and obtain an API key.
Use the API to generate responses for user inputs.
Ensure that the model you choose (e.g., GPT-4) is capable of handling conversations effectively.

Step 3: Implementing Memory for Conversation Continuity

Store user messages and chatbot responses to maintain conversation history.
Use a database, an in-memory storage system, or a file to track conversations.
Retrieve past interactions to provide contextually relevant responses.
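As one illustration of the in-memory option, conversation history can be kept in a bounded buffer so the oldest turns drop off automatically. A sketch using only the standard library; ConversationMemory and the message limit are assumptions for the example, not part of n8n or the OpenAI API.

```python
from collections import deque
from typing import Deque, Dict, List


class ConversationMemory:
    """Keeps the most recent messages; the oldest are discarded once the limit is reached."""

    def __init__(self, max_messages: int = 6) -> None:
        self._messages: Deque[Dict[str, str]] = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def context(self) -> List[Dict[str, str]]:
        """Return the stored messages, oldest first, ready to send to a chat model."""
        return list(self._messages)


memory = ConversationMemory(max_messages=4)
for i in range(3):
    memory.add("user", f"question {i}")
    memory.add("assistant", f"answer {i}")
```

After six messages with a limit of four, only the last two question-and-answer pairs remain in the context.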

Step 4: Setting Up Pinecone for Vector Storage

Create an account on Pinecone and get an API key.
Initialize a Pinecone index to store and retrieve text embeddings.
Convert user messages into vector embeddings using OpenAI's embedding models.
Store these embeddings in Pinecone for future retrieval.
Retrieve similar messages from past conversations to improve response quality.
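The store-then-retrieve cycle can be sketched without any external service. Below, a tiny in-memory index stands in for Pinecone and a letter-frequency function stands in for OpenAI's embedding model. Both are toy assumptions that only illustrate cosine-similarity search, and none of these names come from the Pinecone client.

```python
import math
from typing import Dict, List, Tuple


def toy_embed(text: str) -> List[float]:
    """Toy embedding: 26-dimensional letter-frequency vector (a stand-in for a real model)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Minimal vector index: upsert stores vectors by id, query ranks them by cosine similarity."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def upsert(self, item_id: str, vector: List[float]) -> None:
        self._vectors[item_id] = vector

    def query(self, vector: List[float], top_k: int = 1) -> List[Tuple[str, float]]:
        scored = [(item_id, cosine(vector, v)) for item_id, v in self._vectors.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


index = InMemoryIndex()
for doc in ["refund policy", "vacation policy", "server outage"]:
    index.upsert(doc, toy_embed(doc))
best = index.query(toy_embed("refund policy details"), top_k=1)
```

A real embedding model captures meaning rather than spelling, but the index operations (store a vector, rank by similarity, take the top matches) follow the same shape.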

Step 5: Integrating Everything into the AI Agent

Process incoming user messages through the AI agent.
Retrieve relevant past conversations using the Pinecone Vector Store.
Use OpenAI's Chat Model to generate a response based on retrieved context.
Store new interactions in memory and Pinecone for future reference.
Continuously improve the chatbot's ability to handle queries by refining the retrieval and response process.

Conclusion

By following these steps, you've successfully built an AI chatbot that can maintain context using memory and retrieve relevant information using Pinecone. This architecture ensures efficient and meaningful conversations, making the chatbot more intelligent over time.

Let me know in the comments if you have any questions or need further clarification!


Medium

AWS Fundamentals

By Muhammad Ishaq · 5 min read · Oct 4, 2024 · Medium

This article covers the fundamentals of the Amazon Web Services (AWS) cloud infrastructure.

Region in AWS: In AWS, a region is a separate geographic area. Each region has multiple, isolated locations known as Availability Zones. Each availability zone is isolated but the availability zones are connected through low-latency links.
VPC in AWS: A VPC (Virtual Private Cloud) is a logically isolated section of AWS in which you host resources such as virtual servers. It is made up of subnets, and each subnet hosts your resources. A default VPC is created when an account is created. It has a CIDR (Classless Inter-Domain Routing) block, which is a range of IP addresses that share the same network prefix. The default CIDR block is 172.31.0.0/16.
Elastic Compute Cloud (EC2) AWS: A service provided by AWS that offers scalable cloud computing capacity.
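To make the CIDR notation concrete, Python's ipaddress module can show what the default 172.31.0.0/16 block contains and how it divides into subnets. This is a sketch; splitting into /20 blocks is chosen for illustration.

```python
import ipaddress

vpc = ipaddress.ip_network("172.31.0.0/16")

# A /16 fixes the first 16 bits, leaving 16 bits (65,536 addresses) for hosts.
total_addresses = vpc.num_addresses

# Carve the VPC range into /20 subnets, each with 4,096 addresses.
subnets = list(vpc.subnets(new_prefix=20))
first_subnet = subnets[0]

# Membership test: does an address fall inside the VPC range?
inside = ipaddress.ip_address("172.31.45.10") in vpc
outside = ipaddress.ip_address("10.0.0.5") in vpc
```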

You can launch EC2 instances for both Linux and Windows operating systems. Key steps involved in configuring an EC2 instance include:
1. Choose an Amazon Machine Image (AMI) (Linux, Windows)
2. Select the root storage type of the AMI
  - EBS (Elastic Block Store)-backed or instance store-backed AMI
  - Instance store: The storage is part of the physical server that hosts the virtual server. Its advantage is low latency, since it is physically attached, but it is temporary: data is lost when the instance is stopped or terminated.
  - EBS: Network-attached storage whose data persists when the instance is stopped.
3. Choose the instance type
  - T2 - Burstable performance
  - M3, M4 - Good balance of compute, memory, network
  - C3, C4 - Compute Optimized
  - G2 - Graphics-intensive and GPU workloads (e.g., deep learning or CUDA)
4. Specify the VPC
5. Specify the subnet
6. Enable Public IP allocation
7. Add storage, add tags
8. Add security groups
9. Create a new key pair or choose an existing one for login (SSH key pairs)
Elastic IP in AWS: An Elastic IP is a static public IPv4 address that can be attached to an instance, making it reachable from the internet. An instance can have both a public IP and a private IP. An Elastic IP can be detached from one instance and attached to another (for example, after an instance failure).

Use cases: It enhances accessibility, aids in setting up DNS records, and allows for efficient management of cloud infrastructure.
Reference Link: https://www.scaler.com/topics/elastic-ip-in-aws/
NACL (Network Access Control List) AWS: An optional layer of security for your VPC that acts as a firewall controlling traffic in and out of one or more subnets. Network ACLs can be used to restrict which traffic may enter a subnet. If you see abnormal traffic from a set of IP addresses, you can block those addresses; a common use case is mitigating a DDoS attack.

Note: Inbound rules in a network ACL (Access Control List) govern the traffic entering the subnet, while outbound rules control the traffic leaving the subnet.
Security Groups in AWS: It acts as a firewall to control access to and from your instances.

Example 1: Web-Server - Only allow traffic to port 80 and 443, port 22 for SSH.
Example 2: Database-Server - Only allow traffic to port 3306 if you are hosting MySQL instances.
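The two examples above can be modelled as allow-lists of ports. This is a toy sketch, not the AWS API: SECURITY_GROUPS and is_allowed are hypothetical names, and real security groups also match on protocol and source address.

```python
from typing import Dict, Set

# Inbound ports allowed by each example group; anything not listed is denied.
SECURITY_GROUPS: Dict[str, Set[int]] = {
    "web-server": {80, 443, 22},   # HTTP, HTTPS, SSH
    "database-server": {3306},     # MySQL
}


def is_allowed(group: str, port: int) -> bool:
    """Security groups are allow-lists: traffic passes only if a rule permits it."""
    return port in SECURITY_GROUPS.get(group, set())
```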
Simple Storage Service (S3): A reliable, scalable, low-latency object storage service from AWS. An individual object can range in size from 0 bytes to 5 TB, and you pay for the storage you use.

Objects are stored in a bucket. The bucket name needs to be unique across all regions.

Example of an object in s3 bucket:
- mlvoice-notes is the bucket name
- s3.us-east-1 is the region name
- Docs is the folder name
Note: We can also configure access control lists (ACLs) in the AWS console to manage access to buckets and objects.
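The parts listed in the example combine into a virtual-hosted-style object URL. The sketch below assembles one from the bucket, region, and folder named in the example; the file name notes.txt is a made-up placeholder, and object_url is a hypothetical helper.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style S3 URL: https://<bucket>.s3.<region>.amazonaws.com/<key>."""
    # quote() percent-encodes characters such as spaces but keeps "/" separators.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


url = object_url("mlvoice-notes", "us-east-1", "Docs/notes.txt")
```

Whether the URL is actually reachable depends on the bucket's access settings; by default objects are private.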

Conclusion

AWS offers a comprehensive and scalable cloud infrastructure with a wide array of services to meet diverse business and technical needs. From regions and availability zones that ensure high availability, to VPCs and EC2 instances that provide flexible networking and compute power, AWS equips users with the tools needed to build, secure, and scale applications in the cloud. Services like Elastic IP, Network ACLs, Security Groups, and S3 enhance accessibility, security, and storage efficiency. By leveraging these fundamental AWS services, businesses can focus on innovation while AWS handles the underlying infrastructure with reliability and scalability.

There are many more services and features that AWS offers, and I plan to cover additional services in future articles. AWS's ever-growing suite of tools continues to simplify cloud management, making it easier for businesses to innovate and scale effectively. Stay tuned for more insights on AWS's wide range of services! :)

