
Mastering Image and Video Segmentation with SAM 2

Apr 14, 2025, 10:16 AM

This guide will walk you through what Segment Anything Model 2 (SAM 2) is, how it works, and how you can use it to segment objects in images and videos. SAM 2 offers state-of-the-art performance and flexibility in segmenting objects in images, making it a valuable resource for a variety of computer vision applications. This guide aims to provide a detailed, step-by-step walkthrough for setting up and using SAM 2 to perform image segmentation. By following it, you will be able to produce segmentation masks for images using both box and point prompts.

Learning Objectives

  • Describe the key features and applications of the Segment Anything Model 2 (SAM 2) in image and video segmentation.
  • Successfully configure a CUDA-enabled environment, install necessary dependencies, and clone the Segment Anything Model 2 repository for image segmentation tasks.
  • Apply SAM 2 to generate segmentation masks for images using both box and point prompts and visualize the results effectively.
  • Evaluate how SAM 2 can revolutionize photo and video editing by enabling real-time segmentation, automating complex tasks, and democratizing content creation for a broader audience.

This article was published as a part of the Data Science Blogathon.

Table of contents

  • Prerequisites
  • What is SAM 2?
  • Setting Up and Utilizing SAM 2 for Image Segmentation
  • Key Points to Remember When Working with SAM 2
  • Impressive Potential of SAM 2
  • Conclusion
  • Frequently Asked Questions

Prerequisites

Before you begin, ensure you have a CUDA-enabled GPU for faster processing. Also, verify that you have Python installed on your machine. This guide assumes you have some basic knowledge of Python and image processing concepts.

What is SAM 2?

Segment Anything Model 2 is an advanced tool for image segmentation developed by Facebook AI Research (FAIR). On July 29th, 2024, Meta AI released SAM 2, an advanced image and video segmentation foundation model. SAM 2 enables users to provide points or boxes in an image or video to generate segmentation masks for specific objects.

Key Features of SAM 2

  • Advanced Mask Generation: SAM 2 generates high-quality segmentation masks based on user inputs, such as points or bounding boxes.
  • Flexibility: The model supports both image and video segmentation.
  • Speed and Efficiency: With CUDA support, SAM 2 can perform segmentation tasks rapidly, making it suitable for real-time applications.

Core Components of SAM 2

  • Image Encoder: Encodes the input image for processing.
  • Prompt Encoder: Converts user-provided points or boxes into a format the model can use.
  • Mask Decoder: Generates the final segmentation mask based on the encoded inputs.
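
The sketch below is a minimal, purely illustrative view of how these three components fit together; the function and parameter names are hypothetical stand-ins, not SAM 2's actual internal API.

# Illustrative sketch of SAM 2's prompt-to-mask pipeline (hypothetical names,
# not the real internal API)
def segment(image, prompts, image_encoder, prompt_encoder, mask_decoder):
    image_embeddings = image_encoder(image)       # encode the input image once
    prompt_embeddings = prompt_encoder(prompts)   # encode user points/boxes
    return mask_decoder(image_embeddings, prompt_embeddings)  # fuse into masks

Because the image is encoded once, multiple prompts can be decoded against the same embeddings, which is part of what makes interactive prompting fast.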

Applications of SAM 2

Let us now look into the applications of SAM 2 below:

  • Photo and Video Editing: SAM 2 allows for precise object segmentation, enabling detailed edits and creative effects in photos and videos.
  • Autonomous Vehicles: In autonomous driving, SAM 2 can be used to identify and track objects like pedestrians, vehicles, and road signs in real-time.
  • Medical Imaging: SAM 2 can assist in segmenting anatomical structures in medical images, aiding in diagnostics and treatment planning.

What is Image Segmentation?

Image segmentation is a computer vision technique that involves dividing an image into multiple segments or regions to simplify its analysis. Each segment represents a different object or part of an object within the image, making it easier to identify and analyze specific elements.

Types of Image Segmentation

  • Semantic Segmentation: Classifies each pixel into a predefined category.
  • Instance Segmentation: Differentiates between different instances of the same object category.
  • Panoptic Segmentation: Combines semantic and instance segmentation.
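
To make the distinction concrete, here is a tiny NumPy example (the 4×4 arrays are invented for illustration): a semantic mask assigns all pixels of a class the same id, while an instance mask distinguishes individual objects of that class.

import numpy as np

# Semantic segmentation: every "dog" pixel shares the same class id (1)
semantic_mask = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])

# Instance segmentation: the two dogs get distinct instance ids (1 and 2)
instance_mask = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
])

# Panoptic segmentation would carry both pieces of information,
# e.g. a (class_id, instance_id) pair per pixel.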

Setting Up and Utilizing SAM 2 for Image Segmentation

We’ll guide you through the process of setting up the Segment Anything Model 2 (SAM 2) in your environment and utilizing its powerful capabilities for precise image segmentation tasks. From ensuring your GPU is ready to configuring the model and applying it to real images, each step will be covered in detail to help you harness the full potential of SAM 2.

Step 1: Check GPU Availability and Set Up the Environment

First, let’s ensure that your environment is properly set up, starting with checking for GPU availability and setting the current working directory.

# Check GPU availability and CUDA version
!nvidia-smi
!nvcc --version

# Import necessary modules
import os

# Set the current working directory
HOME = os.getcwd()
print("HOME:", HOME)

Explanation

  • !nvidia-smi and !nvcc --version: These commands check whether your system includes a CUDA-enabled GPU and display the installed CUDA version.
  • os.getcwd(): This function returns the current working directory, which is used for managing file paths.
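
If you prefer checking from Python rather than shell commands, PyTorch exposes the same information; a minimal sketch, assuming torch is already installed:

import torch

# Check CUDA availability from Python instead of nvidia-smi
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("CUDA version used by PyTorch:", torch.version.cuda)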

Step 2: Clone the SAM 2 Repository and Install Dependencies

Next, we need to clone the SAM 2 repository from GitHub and install the required dependencies.

# Clone the SAM 2 repository
!git clone https://github.com/facebookresearch/segment-anything-2.git

# Change to the repository directory
%cd segment-anything-2

# Install the SAM 2 package
!pip install -e .

# Install additional packages
!pip install supervision jupyter_bbox_widget

Explanation

  • !git clone: Clones the SAM 2 repository to your local machine.
  • %cd: Changes the directory to the cloned repository.
  • !pip install -e .: Installs the SAM 2 package in editable mode.
  • !pip install supervision jupyter_bbox_widget: Installs additional packages required for visualization and bounding box widget support.
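
As a quick sanity check that the editable install worked, you can try importing the packages before moving on; if either import fails, re-run the install commands above.

# Verify that the installed packages are importable
import sam2
import supervision as sv

print("sam2 imported from:", sam2.__file__)
print("supervision version:", sv.__version__)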

Step 3: Download Model Checkpoints

Model checkpoints are essential, as they contain the trained parameters of SAM 2. We will download multiple checkpoints for different model sizes.

# Create a directory for checkpoints
!mkdir -p checkpoints

# Download the model checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_tiny.pt -P checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_small.pt -P checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_base_plus.pt -P checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_large.pt -P checkpoints

Explanation

  • !mkdir -p checkpoints: Creates a directory for storing model checkpoints.
  • !wget -q … -P checkpoints: Downloads the model checkpoints into the checkpoints directory. Different checkpoints represent models of varying sizes and capabilities.
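
Because wget runs with -q (quiet), a failed download is easy to miss. The small check below, a sketch using only the standard library, confirms that all four checkpoint files actually landed in the checkpoints directory:

import os

# Verify that all four checkpoints downloaded successfully
expected = [
    "sam2_hiera_tiny.pt",
    "sam2_hiera_small.pt",
    "sam2_hiera_base_plus.pt",
    "sam2_hiera_large.pt",
]

for name in expected:
    path = os.path.join("checkpoints", name)
    if os.path.exists(path):
        print(f"OK       {name} ({os.path.getsize(path) / 1e6:.1f} MB)")
    else:
        print(f"MISSING  {name} - re-run the corresponding wget command")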

Step 4: Download Sample Images

For demonstration purposes, we’ll use some sample images. You can also use your own images by following similar steps.

# Create a directory for data
!mkdir -p data

# Download sample images
!wget -q https://media.roboflow.com/notebooks/examples/dog.jpeg -P data
!wget -q https://media.roboflow.com/notebooks/examples/dog-2.jpeg -P data
!wget -q https://media.roboflow.com/notebooks/examples/dog-3.jpeg -P data
!wget -q https://media.roboflow.com/notebooks/examples/dog-4.jpeg -P data

Explanation

  • !mkdir -p data: Creates a directory for storing sample images.
  • !wget -q … -P data: Downloads the sample images into the data directory.

Step 5: Set Up the SAM 2 Model and Load an Image

Now, we will set up the SAM 2 model, load an image, and prepare it for segmentation.

import cv2
import torch
import numpy as np
import supervision as sv

from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor
from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator

# Enable CUDA optimizations if a GPU is available
if torch.cuda.is_available():
    torch.autocast(device_type="cuda", dtype=torch.bfloat16).__enter__()

    if torch.cuda.get_device_properties(0).major >= 8:
        torch.backends.cuda.matmul.allow_tf32 = True
        torch.backends.cudnn.allow_tf32 = True

# Set the device to CUDA
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Define the model checkpoint and configuration
CHECKPOINT = "checkpoints/sam2_hiera_large.pt"
CONFIG = "sam2_hiera_l.yaml"

# Build the SAM 2 model
sam2_model = build_sam2(CONFIG, CHECKPOINT, device=DEVICE, apply_postprocessing=False)

# Create the automatic mask generator
mask_generator = SAM2AutomaticMaskGenerator(sam2_model)

# Load an image for segmentation (replace with the path to your own image)
IMAGE_PATH = "/content/WhatsApp Image 2024-08-02 at 14.17.11_2b223e01.jpg"
image_bgr = cv2.imread(IMAGE_PATH)
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# Generate segmentation masks
sam2_result = mask_generator.generate(image_rgb)

Explanation

  • CUDA Setup: Enables CUDA for faster processing and sets the device to GPU if available.
  • Model Setup: Builds the SAM 2 model using the specified configuration and checkpoint.
  • Image Loading: Loads and converts the sample image to RGB format.
  • Mask Generation: Uses the automatic mask generator to generate segmentation masks for the loaded image.
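
Each entry in sam2_result is a dictionary describing one mask. Assuming the output format mirrors the original SAM automatic mask generator (the 'segmentation' and 'area' keys are used in the next step), you can inspect it like this:

# Inspect the structure of the generated masks
first = sam2_result[0]
print("Keys per mask:", list(first.keys()))
print("Mask shape:", first['segmentation'].shape)  # boolean H x W array
print("Mask area (pixels):", first['area'])
print("Total masks generated:", len(sam2_result))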

Step 6: Visualize the Segmentation Masks

We will now visualize the segmentation masks generated by SAM 2.

# Annotate the masks on the image
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)
detections = sv.Detections.from_sam(sam_result=sam2_result)
annotated_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

# Plot the original and segmented images side by side
sv.plot_images_grid(
    images=[image_bgr, annotated_image],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

[Output: the source image and the segmented image displayed side by side]

# Extract and plot individual masks
masks = [
    mask['segmentation']
    for mask in sorted(sam2_result, key=lambda x: x['area'], reverse=True)
]

sv.plot_images_grid(
    images=masks[:16],
    grid_size=(4, 4),
    size=(12, 12)
)

[Output: a grid of the sixteen largest individual segmentation masks]

Explanation

  • Mask Annotation: Annotates the segmentation masks on the original image.
  • Visualization: Plots the original and segmented images side by side and also plots individual masks.

Step 7: Use Box Prompts for Segmentation

Box prompts allow us to specify regions of interest in the image for segmentation.

# Define the SAM 2 Image Predictor
predictor = SAM2ImagePredictor(sam2_model)

# Reload the image
image_bgr = cv2.imread(IMAGE_PATH)
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# Encode the image for bounding box input
import base64

def encode_image(filepath):
    with open(filepath, 'rb') as f:
        image_bytes = f.read()
    encoded = str(base64.b64encode(image_bytes), 'utf-8')
    return "data:image/jpg;base64,"+encoded

# Enable the custom widget manager in Colab (set to False outside Colab)
IS_COLAB = True

if IS_COLAB:
    from google.colab import output
    output.enable_custom_widget_manager()

from jupyter_bbox_widget import BBoxWidget

# Create a bounding box widget
widget = BBoxWidget()
widget.image = encode_image(IMAGE_PATH)

# Display the widget
widget

[Output: the interactive bounding box widget displaying the loaded image]

Explanation

  • Image Predictor: Defines the SAM 2 image predictor.
  • Image Encoding: Encodes the image for use with the bounding box widget.
  • Widget Setup: Sets up a bounding box widget for specifying regions of interest.
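
If you are running outside Colab or without widget support, you can skip the widget and define boxes by hand in the same [x_min, y_min, x_max, y_max] pixel format that Step 8 builds from the widget output; the coordinates below are hypothetical placeholders:

import numpy as np

# Manual alternative to the widget: boxes in xyxy pixel coordinates.
# These values are placeholders - replace them to fit your own image.
boxes = np.array([
    [100, 150, 400, 500],   # first object
    [450, 200, 700, 550],   # second object
])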

Step 8: Get Bounding Boxes and Perform Segmentation

After specifying the bounding boxes, we can use them to generate segmentation masks.

# Get the bounding boxes from the widget
boxes = widget.bboxes
boxes = np.array([
    [
        box['x'],
        box['y'],
        box['x'] + box['width'],
        box['y'] + box['height']
    ] for box in boxes
])
# Example output of widget.bboxes (values depend on where you clicked):
# [{'x': 457, 'y': 341, 'width': 0, 'height': 0, 'label': ''},
#  {'x': 205, 'y': 79, 'width': 0, 'height': 1, 'label': ''}]
# Set the image in the predictor
predictor.set_image(image_rgb)

# Generate masks using the bounding boxes
masks, scores, logits = predictor.predict(
    box=boxes,
    multimask_output=False
)

# Remove the singleton dimension from the predicted masks
masks = np.squeeze(masks)

# Annotate and visualize the masks
box_annotator = sv.BoxAnnotator(color=sv.Color.white())
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)

detections = sv.Detections(
    xyxy=boxes,
    mask=masks.astype(bool)
)

source_image = box_annotator.annotate(scene=image_bgr.copy(), detections=detections)
segmented_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

# Plot the annotated images
sv.plot_images_grid(
    images=[source_image, segmented_image],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

[Output: the drawn boxes on the source image alongside the box-prompted segmentation]

Explanation

  • Bounding Boxes: Retrieves the bounding boxes specified using the widget.
  • Mask Generation: Uses the bounding boxes to generate segmentation masks.
  • Visualization: Annotates and visualizes the masks on the original image.
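
If you want to keep the results, the predicted masks can be written to disk as black-and-white images. A short sketch using OpenCV, assuming masks comes from the prediction step above (shape (N, H, W), or (H, W) when only one box was drawn):

# Save each binary mask as a PNG file
masks_to_save = masks if masks.ndim == 3 else masks[np.newaxis, ...]
for i, mask in enumerate(masks_to_save):
    cv2.imwrite(f"mask_{i}.png", mask.astype(np.uint8) * 255)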

Step 9: Use Point Prompts for Segmentation

Point prompts allow us to specify individual points of interest for segmentation.

# Create point prompts based on bounding boxes
input_point = np.array([
    [
        box['x'] + (box['width'] // 2),
        box['y'] + (box['height'] // 2)
    ] for box in widget.bboxes
])
input_label = np.array([1] * len(input_point))

# Generate masks using the point prompts
masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True
)

# Remove the singleton dimension from the predicted masks
masks = np.squeeze(masks)

# Annotate and visualize the masks
point_annotator = sv.PointAnnotator(color_lookup=sv.ColorLookup.INDEX)
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)

detections = sv.Detections(
    xyxy=sv.mask_to_xyxy(masks=masks),
    mask=masks.astype(bool)
)

source_image = point_annotator.annotate(scene=image_bgr.copy(), detections=detections)
segmented_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

# Plot the annotated images
sv.plot_images_grid(
    images=[source_image, segmented_image],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

[Output: point annotations on the source image alongside the point-prompted segmentation]

Explanation

  • Point Prompts: Creates point prompts based on the bounding boxes.
  • Mask Generation: Uses the point prompts to generate segmentation masks.
  • Visualization: Annotates and visualizes the masks on the original image.
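
Point prompts are not limited to foreground clicks: in the SAM family, a point label of 1 marks foreground to include and 0 marks background to exclude. A sketch mixing both, reusing the predictor from above (the coordinates are hypothetical placeholders):

import numpy as np

# Combine a foreground point (label 1) with a background point (label 0).
# Coordinates are placeholders - adjust them for your image.
input_point = np.array([
    [320, 240],   # on the object we want segmented
    [500, 100],   # on background we want excluded
])
input_label = np.array([1, 0])

masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True,
)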

Key Points to Remember When Working with SAM 2

Let us now look into a few important points below:

Revolutionizing Photo and Video Editing

  • Potential to transform the photo and video editing industry.
  • Future enhancements may include improved precision, lower computational requirements, and advanced AI integration.

Real-Time Segmentation and Editing

  • Evolution could lead to real-time segmentation and editing capabilities.
  • Allows seamless alterations in videos and images with minimal effort.

Creative Possibilities for All

  • Opens up new creative possibilities for both professionals and amateurs.
  • Simplifies the manipulation of visual content, the creation of stunning effects, and the production of high-quality media.

Automating Complex Tasks

  • Automates intricate segmentation tasks.
  • Significantly accelerates workflows, making sophisticated editing more accessible and efficient.

Democratizing Content Creation

  • Makes high-level editing tools available to a broader audience.
  • Empowers storytellers and inspires innovation across various sectors, including entertainment, advertising, and education.

Impact on VFX Industry

  • Enhances visual effects (VFX) production by streamlining complex processes.
  • Reduces the time and effort required for creating intricate VFX, enabling more ambitious projects and improving overall quality.

Impressive Potential of SAM 2

The Segment Anything Model 2 (SAM 2) stands poised to revolutionize the fields of photo and video editing by introducing significant advancements in precision and computational efficiency. By integrating advanced AI capabilities, SAM 2 will enable more intuitive user interactions and real-time segmentation and editing, allowing seamless alterations with minimal effort. This groundbreaking technology promises to democratize content creation, empowering both professionals and amateurs to manipulate visual content, create stunning effects, and produce high-quality media with ease.

As SAM 2 automates complex segmentation tasks, it will accelerate workflows and make sophisticated editing accessible to a wider audience. This transformation will inspire innovation across various industries, from entertainment and advertising to education. In the realm of visual effects (VFX), SAM 2 will streamline intricate processes, reducing the time and effort needed to create elaborate VFX. This will enable more ambitious projects, elevate the quality of visual storytelling, and open up new creative possibilities in the VFX world.

Conclusion

By following this guide, you have learned how to set up and use the Segment Anything Model 2 (SAM 2) for image segmentation using both box and point prompts. SAM 2 provides powerful and flexible tools for segmenting objects in images, making it a valuable asset for various computer vision tasks. Feel free to experiment with your images and explore the capabilities of SAM 2 further.

Key Takeaways

  • SAM 2 is an advanced tool developed by Meta AI that enables precise and flexible image and video segmentation using both box and point prompts.
  • The model can significantly enhance photo and video editing by automating complex segmentation tasks, making it more accessible and efficient.
  • Setting up SAM 2 requires a CUDA-enabled GPU and a basic understanding of Python and image processing concepts.
  • SAM 2’s capabilities open new possibilities for both professionals and amateurs in content creation, offering real-time segmentation and creative control.
  • The model has the potential to transform various industries, including visual effects, entertainment, advertising, and education, by democratizing high-level editing tools.

Frequently Asked Questions

Q1. What is SAM 2?

A. SAM 2, or Segment Anything Model 2, is an image and video segmentation model developed by Meta AI that allows users to generate segmentation masks for specific objects by providing box or point prompts.

Q2. What are the prerequisites for utilizing SAM 2?

A. To use SAM 2, you need a CUDA-enabled GPU for faster processing and Python installed on your machine. Basic knowledge of Python and image processing concepts is also helpful.

Q3. How do I set up SAM 2?

A. Set up SAM 2 by checking GPU availability, cloning the SAM 2 repository from GitHub, installing required dependencies, and downloading model checkpoints and sample images for testing.

Q4. What types of prompts can be used with SAM 2 for segmentation?

A. SAM 2 supports both box prompts and point prompts. Box prompts involve specifying regions of interest using bounding boxes, while point prompts involve selecting specific points in the image.

Q5. How can SAM 2 impact photo and video editing?

A. SAM 2 can revolutionize photo and video editing by automating complex segmentation tasks, enabling real-time editing, and making advanced editing tools accessible to a broader audience, thereby enhancing creative possibilities and workflow efficiency.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
