
Mastering Image and Video Segmentation with SAM 2

Apr 14, 2025, 10:16 AM

This guide will walk you through what Segment Anything Model 2 (SAM 2) is, how it works, and how you can use it to segment objects in images and videos. SAM 2 offers state-of-the-art performance and flexibility in segmenting objects in images, making it a valuable resource for a variety of computer vision applications. This guide aims to provide a detailed, step-by-step walkthrough for setting up and using SAM 2 for image segmentation. By following it, you will be able to produce segmentation masks for images using both box and point prompts.

Learning Objectives

  • Describe the key features and applications of the Segment Anything Model 2 (SAM 2) in image and video segmentation.
  • Successfully configure a CUDA-enabled environment, install necessary dependencies, and clone the Segment Anything Model 2 repository for image segmentation tasks.
  • Apply SAM 2 to generate segmentation masks for images using both box and point prompts and visualize the results effectively.
  • Evaluate how SAM 2 can revolutionize photo and video editing by enabling real-time segmentation, automating complex tasks, and democratizing content creation for a broader audience.

This article was published as a part of the Data Science Blogathon.

Table of contents

  • Prerequisites
  • What is SAM 2?
  • Setting Up and Utilizing SAM 2 for Image Segmentation
  • Key Points to Remember When Working with SAM 2
  • Impressive Potential of SAM 2
  • Conclusion
  • Frequently Asked Questions

Prerequisites

Before you begin, ensure you have a CUDA-enabled GPU for faster processing. Also, verify that you have Python installed on your machine. This guide assumes you have some basic knowledge of Python and image processing concepts.

What is SAM 2?

Segment Anything Model 2 (SAM 2) is an advanced image and video segmentation foundation model developed by Meta AI (formerly Facebook AI Research, FAIR) and released on July 29, 2024. SAM 2 enables users to supply points or boxes in an image or video to generate segmentation masks for specific objects.


Key Features of SAM 2

  • Advanced Mask Generation: SAM 2 generates high-quality segmentation masks based on user inputs, such as points or bounding boxes.
  • Flexibility: The model supports both image and video segmentation.
  • Speed and Efficiency: With CUDA support, SAM 2 can perform segmentation tasks rapidly, making it suitable for real-time applications.

Core Components of SAM 2

  • Image Encoder: Encodes the input image for processing.
  • Prompt Encoder: Converts user-provided points or boxes into a format the model can use.
  • Mask Decoder: Generates the final segmentation mask based on the encoded inputs.
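
To make this pipeline concrete, here is a minimal sketch of how the three components cooperate, using the SAM2ImagePredictor API that Step 5 introduces (it assumes the repository and checkpoints from Steps 2 and 3 are already in place): set_image runs the image encoder once per image, and each predict call runs the prompt encoder and mask decoder.

import numpy as np
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# A sketch of the component flow, not a full example; the stand-in image
# below would normally be a real photo loaded with cv2.
model = build_sam2("sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt")
predictor = SAM2ImagePredictor(model)

image_rgb = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in RGB image
predictor.set_image(image_rgb)  # image encoder runs once per image

masks, scores, logits = predictor.predict(  # prompt encoder + mask decoder
    point_coords=np.array([[320, 240]]),    # one foreground point prompt
    point_labels=np.array([1]),
    multimask_output=True,
)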

Applications of SAM 2

Let us now look into the applications of SAM 2 below:

  • Photo and Video Editing: SAM 2 allows for precise object segmentation, enabling detailed edits and creative effects in photos and videos.
  • Autonomous Vehicles: In autonomous driving, SAM 2 can be used to identify and track objects like pedestrians, vehicles, and road signs in real-time.
  • Medical Imaging: SAM 2 can assist in segmenting anatomical structures in medical images, aiding in diagnostics and treatment planning.

What is Image Segmentation?

Image segmentation is a computer vision technique that involves dividing an image into multiple segments or regions to simplify its analysis. Each segment represents a different object or part of an object within the image, making it easier to identify and analyze specific elements.

Types of Image Segmentation

  • Semantic Segmentation: Classifies each pixel into a predefined category.
  • Instance Segmentation: Differentiates between different instances of the same object category.
  • Panoptic Segmentation: Combines semantic and instance segmentation.
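
To make the distinction concrete, here is a toy NumPy illustration (hand-built arrays, not SAM 2 output): semantic segmentation collapses both objects into one class label, while instance segmentation keeps one mask per object.

import numpy as np

# Two "dogs" in a tiny 4x4 image. Semantic segmentation gives both the
# same class id (1); instance segmentation returns one mask per dog.
semantic = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])

dog_a = np.zeros_like(semantic, dtype=bool)
dog_a[:2, :2] = True
dog_b = np.zeros_like(semantic, dtype=bool)
dog_b[2:, 2:] = True

# Panoptic segmentation pairs each pixel with a class id and an instance
# id, combining both views; here instance ids 1 and 2 share class id 1.
print(semantic)
print(dog_a.astype(int) + 2 * dog_b.astype(int))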

Setting Up and Utilizing SAM 2 for Image Segmentation

We’ll guide you through the process of setting up the Segment Anything Model 2 (SAM 2) in your environment and utilizing its powerful capabilities for precise image segmentation tasks. From ensuring your GPU is ready to configuring the model and applying it to real images, each step will be covered in detail to help you harness the full potential of SAM 2.

Step 1: Check GPU Availability and Set Up the Environment

First, let’s ensure that your environment is properly set up, starting with checking for GPU availability and setting the current working directory.

# Check GPU availability and CUDA version
!nvidia-smi
!nvcc --version

# Import necessary modules
import os

# Set the current working directory
HOME = os.getcwd()
print("HOME:", HOME)

Explanation

  • !nvidia-smi and !nvcc --version: These commands check whether your system has a CUDA-enabled GPU and display the installed CUDA version (if you are not in a notebook, see the pure-Python sketch below).
  • os.getcwd(): This function returns the current working directory, which is useful for managing file paths.
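
For reference, the same checks can be performed in pure Python through PyTorch, which is handy outside notebooks where the ! shell syntax is unavailable:

import torch

# Pure-Python alternative to the shell checks above.
if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
    print("CUDA version used by PyTorch:", torch.version.cuda)
else:
    print("No CUDA GPU detected; SAM 2 will be slow on CPU.")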

Step 2: Clone the SAM 2 Repository and Install Dependencies

Next, we need to clone the SAM 2 repository from GitHub and install the required dependencies.

# Clone the SAM 2 repository
!git clone https://github.com/facebookresearch/segment-anything-2.git

# Change to the repository directory
%cd segment-anything-2

# Install the SAM 2 package
!pip install -e .

# Install additional packages
!pip install supervision jupyter_bbox_widget

Explanation

  • !git clone: Clones the SAM 2 repository to your local machine.
  • %cd: Changes the directory to the cloned repository.
  • !pip install -e .: Installs the SAM 2 package in editable mode.
  • !pip install supervision jupyter_bbox_widget: Installs additional packages required for visualization and bounding box widget support.
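
As a quick sanity check, confirm that the freshly installed packages import cleanly before moving on (a minimal sketch; run it in a new cell after the installs finish):

# Verify the editable SAM 2 install and the extra visualization packages.
import sam2
import supervision as sv
from jupyter_bbox_widget import BBoxWidget

print("supervision version:", sv.__version__)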

Step 3: Download Model Checkpoints

Model checkpoints are essential, as they contain the trained parameters of SAM 2. We will download multiple checkpoints for different model sizes.

# Create a directory for checkpoints
!mkdir -p checkpoints

# Download the model checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_tiny.pt -P checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_small.pt -P checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_base_plus.pt -P checkpoints
!wget -q https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_large.pt -P checkpoints

Explanation

  • !mkdir -p checkpoints: Creates a directory for storing model checkpoints.
  • !wget -q … -P checkpoints: Downloads the model checkpoints into the checkpoints directory. Different checkpoints represent models of varying sizes and capabilities.
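
Because wget -q suppresses all output, a failed download is easy to miss. A small check like the following (file names taken from the URLs above) confirms that each checkpoint actually arrived:

import os

# Confirm each checkpoint exists and is plausibly sized.
for name in ["sam2_hiera_tiny.pt", "sam2_hiera_small.pt",
             "sam2_hiera_base_plus.pt", "sam2_hiera_large.pt"]:
    path = os.path.join("checkpoints", name)
    size_mb = os.path.getsize(path) / 1e6 if os.path.exists(path) else 0
    print(f"{path}: {'OK' if size_mb > 1 else 'MISSING'} ({size_mb:.0f} MB)")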

Step 4: Download Sample Images

For demonstration purposes, we’ll use some sample images. You can also use your images by following similar steps.

# Create a directory for data
!mkdir -p data

# Download sample images
!wget -q https://media.roboflow.com/notebooks/examples/dog.jpeg -P data
!wget -q https://media.roboflow.com/notebooks/examples/dog-2.jpeg -P data
!wget -q https://media.roboflow.com/notebooks/examples/dog-3.jpeg -P data
!wget -q https://media.roboflow.com/notebooks/examples/dog-4.jpeg -P data

Explanation

  • !mkdir -p data: Creates a directory for storing sample images.
  • !wget -q … -P data: Downloads the sample images into the data directory.
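
Optionally, confirm that the samples load correctly before segmenting. cv2.imread returns None on a bad path instead of raising an error, so an explicit check saves debugging time later:

import cv2

# Load one sample and fail loudly if the download or path is wrong.
img = cv2.imread("data/dog.jpeg")
assert img is not None, "data/dog.jpeg is missing or unreadable"
print("dog.jpeg shape (H, W, C):", img.shape)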

Step 5: Set Up the SAM 2 Model and Load an Image

Now, we will set up the SAM 2 model, load an image, and prepare it for segmentation.

import cv2
import torch
import numpy as np
import supervision as sv

from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor
from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator

# Enable CUDA if available
torch.autocast(device_type="cuda", dtype=torch.bfloat16).__enter__()

if torch.cuda.get_device_properties(0).major >= 8:
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True

# Set the device to CUDA
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Define the model checkpoint and configuration
CHECKPOINT = "checkpoints/sam2_hiera_large.pt"
CONFIG = "sam2_hiera_l.yaml"

# Build the SAM 2 model
sam2_model = build_sam2(CONFIG, CHECKPOINT, device=DEVICE, apply_postprocessing=False)

# Create the automatic mask generator
mask_generator = SAM2AutomaticMaskGenerator(sam2_model)

# Load an image for segmentation (a sample downloaded in Step 4; replace
# the path with your own image if you prefer)
IMAGE_PATH = "data/dog.jpeg"
image_bgr = cv2.imread(IMAGE_PATH)
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# Generate segmentation masks
sam2_result = mask_generator.generate(image_rgb)

Explanation

  • CUDA Setup: Enables CUDA for faster processing and sets the device to GPU if available.
  • Model Setup: Builds the SAM 2 model using the specified configuration and checkpoint.
  • Image Loading: Loads and converts the sample image to RGB format.
  • Mask Generation: Uses the automatic mask generator to generate segmentation masks for the loaded image.
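
Before visualizing, it helps to inspect what the generator returned. sam2_result is a list with one dict per detected mask; the steps below rely on its 'segmentation' (boolean HxW array) and 'area' keys, and the dicts also carry bookkeeping fields such as 'bbox' and 'predicted_iou' (exact keys may vary by version):

# Inspect the structure of the automatic mask generator's output.
print("number of masks:", len(sam2_result))
first = sam2_result[0]
print("keys:", sorted(first.keys()))
print("mask shape:", first["segmentation"].shape, "area:", first["area"])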

Step 6: Visualize the Segmentation Masks

We will now visualize the segmentation masks generated by SAM 2.

# Annotate the masks on the image
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)
detections = sv.Detections.from_sam(sam_result=sam2_result)
annotated_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

# Plot the original and segmented images side by side
sv.plot_images_grid(
    images=[image_bgr, annotated_image],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

[Figure: source image (left) and SAM 2 automatic segmentation (right)]

# Extract and plot individual masks
masks = [
    mask['segmentation']
    for mask in sorted(sam2_result, key=lambda x: x['area'], reverse=True)
]

sv.plot_images_grid(
    images=masks[:16],
    grid_size=(4, 4),
    size=(12, 12)
)

[Figure: grid of the 16 largest individual masks]

Explanation

  • Mask Annotation: Annotates the segmentation masks on the original image.
  • Visualization: Plots the original and segmented images side by side and also plots individual masks.

Step 7: Use Box Prompts for Segmentation

Box prompts allow us to specify regions of interest in the image for segmentation.

# Define the SAM 2 Image Predictor
predictor = SAM2ImagePredictor(sam2_model)

# Reload the image
image_bgr = cv2.imread(IMAGE_PATH)
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# Encode the image for bounding box input
import base64

def encode_image(filepath):
    with open(filepath, 'rb') as f:
        image_bytes = f.read()
    encoded = str(base64.b64encode(image_bytes), 'utf-8')
    return "data:image/jpg;base64,"+encoded

# Enable custom widget manager in Colab
IS_COLAB = True

if IS_COLAB:
    from google.colab import output
    output.enable_custom_widget_manager()

from jupyter_bbox_widget import BBoxWidget

# Create a bounding box widget
widget = BBoxWidget()
widget.image = encode_image(IMAGE_PATH)

# Display the widget
widget

[Figure: interactive bounding box annotation widget over the image]

Explanation

  • Image Predictor: Defines the SAM 2 image predictor.
  • Image Encoding: Encodes the image for use with the bounding box widget.
  • Widget Setup: Sets up a bounding box widget for specifying regions of interest.
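
If the widget does not render in your environment (or you are running a plain script), you can skip it and hand-write boxes in the same x/y/width/height format the widget produces, then use them wherever Step 8 reads widget.bboxes. The coordinates below are placeholders for illustration:

# Hypothetical manual boxes in the widget's output format; adjust the
# coordinates to fit your own image.
manual_bboxes = [
    {"x": 100, "y": 80, "width": 220, "height": 180, "label": ""},
    {"x": 380, "y": 120, "width": 150, "height": 200, "label": ""},
]
# In Step 8, substitute: boxes = manual_bboxes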

Step 8: Get Bounding Boxes and Perform Segmentation

After specifying the bounding boxes, we can use them to generate segmentation masks.

# Get the bounding boxes from the widget
boxes = widget.bboxes
boxes = np.array([
    [
        box['x'],
        box['y'],
        box['x'] + box['width'],
        box['y'] + box['height']
    ] for box in boxes
])
# Example value of widget.bboxes (a width/height of 0 means the annotation
# was clicked rather than dragged out into a box):
# [{'x': 457, 'y': 341, 'width': 0, 'height': 0, 'label': ''},
#  {'x': 205, 'y': 79, 'width': 0, 'height': 1, 'label': ''}]
# Set the image in the predictor
predictor.set_image(image_rgb)

# Generate masks using the bounding boxes
masks, scores, logits = predictor.predict(
    box=boxes,
    multimask_output=False
)

# Convert masks to binary format
masks = np.squeeze(masks)

# Annotate and visualize the masks
box_annotator = sv.BoxAnnotator(color=sv.Color.white())
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)

detections = sv.Detections(
    xyxy=boxes,
    mask=masks.astype(bool)
)

source_image = box_annotator.annotate(scene=image_bgr.copy(), detections=detections)
segmented_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

# Plot the annotated images
sv.plot_images_grid(
    images=[source_image, segmented_image],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

[Figure: box prompts on the source image (left) and the resulting segmentation (right)]

Explanation

  • Bounding Boxes: Retrieves the bounding boxes specified using the widget.
  • Mask Generation: Uses the bounding boxes to generate segmentation masks.
  • Visualization: Annotates and visualizes the masks on the original image.
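
A quick shape check can catch prompt mistakes early. With N boxes and multimask_output=False, the predictor typically returns masks of shape (N, 1, H, W), and the np.squeeze call above drops the singleton dimension so sv.Detections receives (N, H, W). Note that with a single box, squeezing yields (H, W), so you may need to restore the leading axis:

# Sanity-check the shapes fed into sv.Detections (a sketch).
print("boxes:", boxes.shape)    # (N, 4) in xyxy format
print("masks:", masks.shape)    # (N, H, W) after squeeze
if masks.ndim == 2:             # single box: re-add the leading axis
    masks = masks[np.newaxis, ...]
print("scores:", scores.shape)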

Step 9: Use Point Prompts for Segmentation

Point prompts allow us to specify individual points of interest for segmentation.

# Create point prompts based on bounding boxes
input_point = np.array([
    [
        box['x'] + (box['width'] // 2),
        box['y'] + (box['height'] // 2)
    ] for box in widget.bboxes
])
input_label = np.array([1] * len(input_point))

# Generate masks using the point prompts
masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True
)

# Convert masks to binary format
masks = np.squeeze(masks)

# Annotate and visualize the masks
point_annotator = sv.PointAnnotator(color_lookup=sv.ColorLookup.INDEX)
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)

detections = sv.Detections(
    xyxy=sv.mask_to_xyxy(masks=masks),
    mask=masks.astype(bool)
)

source_image = point_annotator.annotate(scene=image_bgr.copy(), detections=detections)
segmented_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

# Plot the annotated images
sv.plot_images_grid(
    images=[source_image, segmented_image],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

[Figure: point prompts on the source image (left) and the resulting segmentation (right)]

Explanation

  • Point Prompts: Creates point prompts based on the bounding boxes.
  • Mask Generation: Uses the point prompts to generate segmentation masks.
  • Visualization: Annotates and visualizes the masks on the original image.
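
Because multimask_output=True returns several candidate masks per prompt together with quality scores, a common refinement (a sketch, not part of the walkthrough above) is to keep only the highest-scoring candidate:

# Keep only the candidate mask the model scored highest.
best = int(np.argmax(scores))
best_mask = masks[best]
print(f"kept candidate {best} with score {scores[best]:.3f}")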

Key Points to Remember When Working with SAM 2

Let us now look at a few important points below:

Revolutionizing Photo and Video Editing

  • Potential to transform the photo and video editing industry.
  • Future enhancements may include improved precision, lower computational requirements, and advanced AI integration.

Real-Time Segmentation and Editing

  • Evolution could lead to real-time segmentation and editing capabilities.
  • Allows seamless alterations in videos and images with minimal effort.

Creative Possibilities for All

  • Opens up new creative possibilities for both professionals and amateurs.
  • Simplifies the manipulation of visual content, the creation of stunning effects, and the production of high-quality media.

Automating Complex Tasks

  • Automates intricate segmentation tasks.
  • Significantly accelerates workflows, making sophisticated editing more accessible and efficient.

Democratizing Content Creation

  • Makes high-level editing tools available to a broader audience.
  • Empowers storytellers and inspires innovation across various sectors, including entertainment, advertising, and education.

Impact on VFX Industry

  • Enhances visual effects (VFX) production by streamlining complex processes.
  • Reduces the time and effort required for creating intricate VFX, enabling more ambitious projects and improving overall quality.

Impressive Potential of SAM 2

The Segment Anything Model 2 (SAM 2) stands poised to revolutionize the fields of photo and video editing by introducing significant advancements in precision and computational efficiency. By integrating advanced AI capabilities, SAM 2 will enable more intuitive user interactions and real-time segmentation and editing, allowing seamless alterations with minimal effort. This groundbreaking technology promises to democratize content creation, empowering both professionals and amateurs to manipulate visual content, create stunning effects, and produce high-quality media with ease.

As SAM 2 automates complex segmentation tasks, it will accelerate workflows and make sophisticated editing accessible to a wider audience. This transformation will inspire innovation across various industries, from entertainment and advertising to education. In the realm of visual effects (VFX), SAM 2 will streamline intricate processes, reducing the time and effort needed to create elaborate VFX. This will enable more ambitious projects, elevate the quality of visual storytelling, and open up new creative possibilities in the VFX world.

Conclusion

By following this guide, you have learned how to set up and use the Segment Anything Model 2 (SAM 2) for image segmentation using both box and point prompts. SAM 2 provides powerful and flexible tools for segmenting objects in images, making it a valuable asset for various computer vision tasks. Feel free to experiment with your images and explore the capabilities of SAM 2 further.

Key Takeaways

  • SAM 2 is an advanced tool developed by Meta AI that enables precise and flexible image and video segmentation using both box and point prompts.
  • The model can significantly enhance photo and video editing by automating complex segmentation tasks, making it more accessible and efficient.
  • Setting up SAM 2 requires a CUDA-enabled GPU and a basic understanding of Python and image processing concepts.
  • SAM 2’s capabilities open new possibilities for both professionals and amateurs in content creation, offering real-time segmentation and creative control.
  • The model has the potential to transform various industries, including visual effects, entertainment, advertising, and education, by democratizing high-level editing tools.

Frequently Asked Questions

Q1. What is SAM 2?

A. SAM 2, or Segment Anything Model 2, is an image and video segmentation model developed by Meta AI that allows users to generate segmentation masks for specific objects by providing box or point prompts.

Q2. What are the prerequisites for utilizing SAM 2?

A. To use SAM 2, you need a CUDA-enabled GPU for faster processing and Python installed on your machine. Basic knowledge of Python and image processing concepts is also helpful.

Q3. How do I set up SAM 2?

A. Set up SAM 2 by checking GPU availability, cloning the SAM 2 repository from GitHub, installing required dependencies, and downloading model checkpoints and sample images for testing.

Q4. What types of prompts can be used with SAM 2 for segmentation?

A. SAM 2 supports both box prompts and point prompts. Box prompts involve specifying regions of interest using bounding boxes, while point prompts involve selecting specific points in the image.

Q5. How can SAM 2 impact photo and video editing?

A. SAM 2 can revolutionize photo and video editing by automating complex segmentation tasks, enabling real-time editing, and making advanced editing tools available to a broader audience, thereby improving creative possibilities and workflow efficiency.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
