最近中文字幕完整版,欧美成人vps

Home

Web Front-end

JS Tutorial

Running DeepSeek Janus-Pro-in the Browser: A Step-by-Step Guide

Mary-Kate Olsen

Jan 28, 2025 am 10:32 AM

Running DeepSeek Janus-Pro-in the Browser: A Step-by-Step Guide

Running a large language model (LLM) directly in the browser brings new possibilities to the client AI application that protects privacy. In this blog post, we will explore how to use WebGPU and Hugging Face's transformers.js library in the browser to fully run a powerful text to image generation model

Deepseek Janus-PRO-1B .

Why choose a browser -based reasoning?

Privacy

: The data will never leave the user's device.

Cost benefits : No server infrastructure is required.
: It can run on any device with a modern browser and WebGPU.

Transformers.js

and WebGPU accelerate optimization, deeds dedicated to the design of multi-modal tasks such as text-to-image generation Interview with reasoning.

Key tools and libraries

Transformers.js : JavaScript version of the Transformers library of Hugging Face, optimized for the execution of the browser.

WebGPU : Modern API for GPU accelerated in the browser, it replaces WebGL with the improved ML workload performance.

Onnx Runtime

Demonstration code exercise
The following example demonstrates how to load and run Deepseek Janus-PRO-1B in the Web Worker for non-blocking reasoning. The complete code can be found in the GitHub repository.
Run the demonstration

Deepseek Janus-PRO-1B browser demonstration

. The key features of the demonstration

import {
  AutoProcessor,
  MultiModalityCausalLM,
  BaseStreamer,
  TextStreamer,
  InterruptableStoppingCriteria,
} from "@huggingface/transformers";

// 定義常量
const IMAGE_GENERATION_COMMAND_PREFIX = "/imagine ";
const MAX_NEW_TEXT_TOKENS = 1024;

/**
 * 用于執(zhí)行 WebGPU 功能檢測的輔助函數(shù)
 */
let fp16_supported = false;
async function check() {
  try {
    const adapter = await navigator.gpu.requestAdapter();
    if (!adapter) {
      throw new Error("WebGPU 不受支持（未找到適配器）");
    }
    fp16_supported = adapter.features.has("shader-f16");
    self.postMessage({
      status: "success",
      data: fp16_supported,
    });
  } catch (e) {
    self.postMessage({
      status: "error",
      data: e.toString(),
    });
  }
}

/**
 * 此類使用單例模式來啟用管道延遲加載
 */
class ImageGenerationPipeline {
  static model_id = "onnx-community/Janus-Pro-1B-ONNX";

  static async getInstance(progress_callback = null) {
    this.processor ??= AutoProcessor.from_pretrained(this.model_id, {
      progress_callback,
    });

    this.model ??= MultiModalityCausalLM.from_pretrained(this.model_id, {
      dtype: fp16_supported
        ? {
            prepare_inputs_embeds: "q4",
            language_model: "q4f16",
            lm_head: "fp16",
            gen_head: "fp16",
            gen_img_embeds: "fp16",
            image_decode: "fp32",
          }
        : {
            prepare_inputs_embeds: "fp32",
            language_model: "q4",
            lm_head: "fp32",
            lm_head: "fp32",
            gen_head: "fp32",
            gen_img_embeds: "fp32",
            image_decode: "fp32",
          },
      device: {
        prepare_inputs_embeds: "wasm", // TODO 當(dāng)錯誤修復(fù)后使用“webgpu”
        language_model: "webgpu",
        lm_head: "webgpu",
        gen_head: "webgpu",
        gen_img_embeds: "webgpu",
        image_decode: "webgpu",
      },
      progress_callback,
    });

    return Promise.all([this.processor, this.model]);
  }
}

class ProgressStreamer extends BaseStreamer {
  constructor(total, on_progress) {
    super();
    this.total = total;
    this.on_progress = on_progress;

    this.count = null;
    this.start_time = null;
  }

  put(value) {
    if (this.count === null) {
      // 忽略第一批標(biāo)記（提示）
      this.count = 0;
      this.start_time = performance.now();
      return;
    }

    const progress = ++this.count / this.total;

    this.on_progress({
      count: this.count,
      total: this.total,
      progress,
      time: performance.now() - this.start_time,
    });
  }

  end() {
    /* 什么也不做 */
  }
}

const stopping_criteria = new InterruptableStoppingCriteria();

async function generate(messages) {
  // 對于此演示，我們只響應(yīng)最后一條消息
  const message = messages.at(-1);

  // 告訴主線程我們已開始
  self.postMessage({ status: "start" });

  // 加載管道
  const [processor, model] = await ImageGenerationPipeline.getInstance();

  // 確定用戶是否要生成圖像或文本
  if (message.content.startsWith(IMAGE_GENERATION_COMMAND_PREFIX)) {
    const text = message.content.replace(IMAGE_GENERATION_COMMAND_PREFIX, "");

    const conversation = [
      {
        role: "", // 使用標(biāo)題大小寫
        content: text,
      },
    ];
    const inputs = await processor(conversation, {
      chat_template: "text_to_image",
    });

    const callback_function = (output) => {
      self.postMessage({
        status: "image-update",
        ...output,
      });
    };

    const num_image_tokens = processor.num_image_tokens;
    const streamer = new ProgressStreamer(num_image_tokens, callback_function);

    const outputs = await model.generate_images({
      ...inputs,
      min_new_tokens: num_image_tokens,
      max_new_tokens: num_image_tokens,
      do_sample: true,
      streamer,
    });

    const blob = await outputs[0].toBlob();

    // 將輸出發(fā)送回主線程
    self.postMessage({
      status: "image-update",
      blob,
    });
  } else {
    const inputs = await processor(
      message.image
        ? [
            {
              role: "",
              content: "<image_placeholder>\n" + message.content,
              images: [message.image],
            },
          ]
        : [
            {
              role: "",
              content:
                "您是一位樂于助人的助手。以簡潔的方式回答用戶的問題。",
            },
            {
              role: "",
              content: message.content,
            },
          ],
    );

    let startTime;
    let numTokens = 0;
    let tps;
    const token_callback_function = () => {
      startTime ??= performance.now();

      if (numTokens++ > 0) {
        tps = (numTokens / (performance.now() - startTime)) * 1000;
      }
    };
    const callback_function = (output) => {
      self.postMessage({
        status: "text-update",
        output,
        tps,
        numTokens,
      });
    };

    const streamer = new TextStreamer(processor.tokenizer, {
      skip_prompt: true,
      skip_special_tokens: true,
      callback_function,
      token_callback_function,
    });

    // 生成響應(yīng)
    const outputs = await model.generate({
      ...inputs,
      max_new_tokens: MAX_NEW_TEXT_TOKENS,
      do_sample: false,
      streamer,
      stopping_criteria,
    });
  }

  // 告訴主線程我們已完成
  self.postMessage({
    status: "complete",
  });
}

async function load() {
  self.postMessage({
    status: "loading",
    data: "正在加載模型...",
  });

  // 加載管道并將其保存以備將來使用。
  const [processor, model] = await ImageGenerationPipeline.getInstance((x) => {
    // 我們還向管道添加進度回調(diào)，以便我們可以
    // 跟蹤模型加載。
    self.postMessage(x);
  });

  self.postMessage({ status: "ready" });
}

// 偵聽來自主線程的消息
self.addEventListener("message", async (e) => {
  const { type, data } = e.data;

  switch (type) {
    case "check":
      check();
      break;

    case "load":
      load();
      break;

    case "generate":
      stopping_criteria.reset();
      generate(data);
      break;

    case "interrupt":
      stopping_criteria.interrupt();
      break;

    case "reset":
      stopping_criteria.reset();
      break;
  }
});

The real -time progress of the model loading and reasoning is updated.

WebGPU is accelerated (required Chrome 113 or Edge 113).

Complete client execution -the data will not be sent to the external server.

Challenge and optimization

Model quantification

: Model quantification to 8 digits to reduce its size and increase loading speed.

Memory management
:: WebGPU is still in the test stage, but it is essential for performance.

Conclusion

Running Deepseek Janus-PRO-1B in the browser shows the potential of the client AI. With tools such as Transformers.js and WebGPUs, complex models can now run efficiently in the limited environment, while protecting user privacy.

Follow -up steps :
- Try different prompts and model configurations.
- Exploring the fine -tuning model for mission for specific fields.
- Monitor the adoption of WebGPU to ensure wider compatibility.
For developers, this marks the exciting transformation of the descent and user -centric AI applications. In -depth research on the example code and start construction! ?

This Revied Output Maintains The Original Meaning While USING DIFFFERENG and Sentence Structures. He code is also included, though it's a very long code snippet and might benefit from being brooken into Smaller, More manageable chunks in a real application.

The above is the detailed content of Running DeepSeek Janus-Pro-in the Browser: A Step-by-Step Guide. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress images for free

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Grass Wonder Build Guide | Uma Musume Pretty Derby

4 weeks ago By Jack chen

Roblox: 99 Nights In The Forest - All Badges And How To Unlock Them

4 weeks ago By DDD

Uma Musume Pretty Derby Banner Schedule (July 2025)

1 months ago By Jack chen

RimWorld Odyssey Temperature Guide for Ships and Gravtech

3 weeks ago By Jack chen

Windows Security is blank or not showing options

1 months ago By 下次還敢

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Laravel Tutorial

1597

PHP Tutorial

1488

Related knowledge

How to make an HTTP request in Node.js? Jul 13, 2025 am 02:18 AM

There are three common ways to initiate HTTP requests in Node.js: use built-in modules, axios, and node-fetch. 1. Use the built-in http/https module without dependencies, which is suitable for basic scenarios, but requires manual processing of data stitching and error monitoring, such as using https.get() to obtain data or send POST requests through .write(); 2.axios is a third-party library based on Promise. It has concise syntax and powerful functions, supports async/await, automatic JSON conversion, interceptor, etc. It is recommended to simplify asynchronous request operations; 3.node-fetch provides a style similar to browser fetch, based on Promise and simple syntax

JavaScript Data Types: Primitive vs Reference Jul 13, 2025 am 02:43 AM

JavaScript data types are divided into primitive types and reference types. Primitive types include string, number, boolean, null, undefined, and symbol. The values are immutable and copies are copied when assigning values, so they do not affect each other; reference types such as objects, arrays and functions store memory addresses, and variables pointing to the same object will affect each other. Typeof and instanceof can be used to determine types, but pay attention to the historical issues of typeofnull. Understanding these two types of differences can help write more stable and reliable code.

JavaScript time object, someone builds an eactexe, faster website on Google Chrome, etc. Jul 08, 2025 pm 02:27 PM

Hello, JavaScript developers! Welcome to this week's JavaScript news! This week we will focus on: Oracle's trademark dispute with Deno, new JavaScript time objects are supported by browsers, Google Chrome updates, and some powerful developer tools. Let's get started! Oracle's trademark dispute with Deno Oracle's attempt to register a "JavaScript" trademark has caused controversy. Ryan Dahl, the creator of Node.js and Deno, has filed a petition to cancel the trademark, and he believes that JavaScript is an open standard and should not be used by Oracle

What is the cache API and how is it used with Service Workers? Jul 08, 2025 am 02:43 AM

CacheAPI is a tool provided by the browser to cache network requests, which is often used in conjunction with ServiceWorker to improve website performance and offline experience. 1. It allows developers to manually store resources such as scripts, style sheets, pictures, etc.; 2. It can match cache responses according to requests; 3. It supports deleting specific caches or clearing the entire cache; 4. It can implement cache priority or network priority strategies through ServiceWorker listening to fetch events; 5. It is often used for offline support, speed up repeated access speed, preloading key resources and background update content; 6. When using it, you need to pay attention to cache version control, storage restrictions and the difference from HTTP caching mechanism.

Handling Promises: Chaining, Error Handling, and Promise Combinators in JavaScript Jul 08, 2025 am 02:40 AM

Promise is the core mechanism for handling asynchronous operations in JavaScript. Understanding chain calls, error handling and combiners is the key to mastering their applications. 1. The chain call returns a new Promise through .then() to realize asynchronous process concatenation. Each .then() receives the previous result and can return a value or a Promise; 2. Error handling should use .catch() to catch exceptions to avoid silent failures, and can return the default value in catch to continue the process; 3. Combinators such as Promise.all() (successfully successful only after all success), Promise.race() (the first completion is returned) and Promise.allSettled() (waiting for all completions)

Leveraging Array.prototype Methods for Data Manipulation in JavaScript Jul 06, 2025 am 02:36 AM

JavaScript array built-in methods such as .map(), .filter() and .reduce() can simplify data processing; 1) .map() is used to convert elements one to one to generate new arrays; 2) .filter() is used to filter elements by condition; 3) .reduce() is used to aggregate data as a single value; misuse should be avoided when used, resulting in side effects or performance problems.

JS roundup: a deep dive into the JavaScript event loop Jul 08, 2025 am 02:24 AM

JavaScript's event loop manages asynchronous operations by coordinating call stacks, WebAPIs, and task queues. 1. The call stack executes synchronous code, and when encountering asynchronous tasks, it is handed over to WebAPI for processing; 2. After the WebAPI completes the task in the background, it puts the callback into the corresponding queue (macro task or micro task); 3. The event loop checks whether the call stack is empty. If it is empty, the callback is taken out from the queue and pushed into the call stack for execution; 4. Micro tasks (such as Promise.then) take precedence over macro tasks (such as setTimeout); 5. Understanding the event loop helps to avoid blocking the main thread and optimize the code execution order.

Understanding Event Bubbling and Capturing in JavaScript DOM events Jul 08, 2025 am 02:36 AM

Event bubbles propagate from the target element outward to the ancestor node, while event capture propagates from the outer layer inward to the target element. 1. Event bubbles: After clicking the child element, the event triggers the listener of the parent element upwards in turn. For example, after clicking the button, it outputs Childclicked first, and then Parentclicked. 2. Event capture: Set the third parameter to true, so that the listener is executed in the capture stage, such as triggering the capture listener of the parent element before clicking the button. 3. Practical uses include unified management of child element events, interception preprocessing and performance optimization. 4. The DOM event stream is divided into three stages: capture, target and bubble, and the default listener is executed in the bubble stage.

See all articles

亚洲国产日韩欧美一区二区三区,精品亚洲国产成人av在线,国产99视频精品免视看7,99国产精品久久久久久久成人热,欧美日韩亚洲国产综合乱

Running DeepSeek Janus-Pro-in the Browser: A Step-by-Step Guide

Privacy

WebGPU : Modern API for GPU accelerated in the browser, it replaces WebGL with the improved ML workload performance.

. The key features of the demonstration

WebGPU is accelerated (required Chrome 113 or Edge 113).

Conclusion

Hot AI Tools

Undress AI Tool

Undresser.AI Undress

AI Clothes Remover

Clothoff.io

Video Face Swap

Hot Article

Hot Tools

Notepad++7.3.1

SublimeText3 Chinese version

Zend Studio 13.0.1

Dreamweaver CS6

SublimeText3 Mac version

Hot Topics