Reading CSV files in Python is usually done with the pandas library or the built-in csv module. 1. pandas reads a file with pd.read_csv() and returns a DataFrame; it supports parameters such as sep, header, index_col, encoding, and na_values, and suits data analysis. 2. The csv module reads row by row through csv.reader or csv.DictReader; the former yields each row as a list and the latter as a dictionary, which suits lightweight scripts or code that must avoid third-party dependencies. 3. Common problems: use a full path to avoid path errors, set encoding='gbk' or encoding='utf-8' to fix garbled Chinese text, and use the skiprows parameter to skip specific rows. pandas is recommended for everyday analysis; the csv module is an option for simple scripts.
Reading CSV files is a very common operation in Python. It is usually done with the pandas library, or with the csv module from the standard library. Here are a few practical examples for different scenarios.

1. Read CSV using pandas (recommended)
    import pandas as pd

    # Read the CSV file
    df = pd.read_csv('data.csv')

    # Show the first few rows of data
    print(df.head())

Notes:
- pd.read_csv() is the most commonly used method.
- It can parse column names automatically, handle missing values, accept explicit data types, and more.
- It returns a DataFrame for subsequent data analysis.
Common parameters:

- sep=',': the delimiter (a comma by default).
- header=0: use the first row as the column names.
- index_col=None: do not use any column as the index (the default); pass a column name or position to use that column as the index instead.
- encoding='utf-8': the file encoding, often needed when the file contains Chinese text.
- na_values=['N/A', '']: additional strings to treat as missing values.
Example:
    df = pd.read_csv('data.csv', encoding='utf-8', na_values='NULL')
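To see how several of these parameters combine, here is a minimal sketch. It assumes a semicolon-delimited table with an id column and the string "NULL" marking a missing value; the sample data is made up for illustration and is read from an in-memory buffer instead of a file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-delimited, with "NULL" marking a missing value.
raw = io.StringIO(
    "id;name;score\n"
    "1;Alice;90\n"
    "2;Bob;NULL\n"
)

df = pd.read_csv(
    raw,
    sep=";",             # fields are separated by semicolons, not commas
    header=0,            # the first row holds the column names
    index_col="id",      # use the "id" column as the row index
    na_values=["NULL"],  # treat the string "NULL" as a missing value
)

print(df)
print(df["score"].isna().sum())  # number of missing scores
```

Passing a file path instead of the buffer works the same way.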
2. Read CSV using the csv module (standard library)
If you don't want to rely on third-party libraries, you can use Python's built-in csv module.

    import csv

    # newline='' lets the csv module handle line endings itself
    with open('data.csv', mode='r', encoding='utf-8', newline='') as file:
        reader = csv.reader(file)
        for row in reader:
            print(row)  # Each row is a list of strings

If the CSV has a header row, you can use DictReader:

    import csv

    with open('data.csv', mode='r', encoding='utf-8', newline='') as file:
        reader = csv.DictReader(file)
        for row in reader:
            print(row)  # Each row is a dictionary keyed by column name
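One thing to keep in mind with the csv module is that every value comes back as a string, so numeric columns need an explicit conversion. The sketch below assumes a made-up two-column table with name and score columns and reads it from an in-memory buffer so it runs without a file on disk.

```python
import csv
import io

# Hypothetical sample data standing in for a file on disk.
raw = io.StringIO("name,score\nAlice,90\nBob,85\n")

reader = csv.DictReader(raw)
rows = []
for row in reader:
    # Every value arrives as a string, so convert numeric fields explicitly.
    rows.append({"name": row["name"], "score": int(row["score"])})

average = sum(r["score"] for r in rows) / len(rows)
print(rows)
print(average)
```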
3. Common problems
File path error?
Make sure the file is in the current working directory, or use the full path:
    df = pd.read_csv(r'C:\path\to\your\data.csv')
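A small helper can make path problems easier to diagnose. The following is only a sketch: resolve_csv_path is a hypothetical name, not part of pandas or the standard library, and it assumes that a missing file should fail early with a message showing both the resolved path and the current working directory.

```python
from pathlib import Path


def resolve_csv_path(filename, base_dir=None):
    """Return an absolute path to filename, or raise if the file does not exist.

    base_dir defaults to the current working directory.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"CSV file not found: {path} (current directory: {Path.cwd()})"
        )
    return path
```

The returned path can be passed straight to pd.read_csv() or open().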
Garbled Chinese text?
Try a different encoding:
    pd.read_csv('data.csv', encoding='gbk')  # Commonly used on Chinese Windows systems
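When the encoding is not known in advance, one option is to try a short list of candidates in order. The sketch below uses the csv module; read_rows_with_fallback is a hypothetical helper, and the candidate list is an assumption. UTF-8 is tried first because it rejects most non-UTF-8 input, whereas GBK may decode bytes from other encodings without raising an error and silently produce wrong characters.

```python
import csv


def read_rows_with_fallback(path, encodings=("utf-8", "gbk")):
    """Try each encoding in turn and return (rows, encoding_used)."""
    for enc in encodings:
        try:
            with open(path, mode="r", encoding=enc, newline="") as file:
                # Read the whole file inside the try block so that a decode
                # error anywhere in the file triggers the fallback.
                return list(csv.reader(file)), enc
        except UnicodeDecodeError:
            continue  # this encoding does not fit; try the next one
    raise ValueError(f"could not decode {path} with any of {encodings}")
```

The same idea carries over to pandas by calling pd.read_csv(path, encoding=enc) inside the loop.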
Skip certain lines?
Use the skiprows parameter:
    pd.read_csv('data.csv', skiprows=1)  # Skip the first line
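skiprows accepts either a count or a list of line numbers, and the difference is easy to miss. The sketch below assumes a made-up export whose first line is a free-text note above the real header; it reads from an in-memory buffer so it runs without a file.

```python
import io

import pandas as pd

# Hypothetical export: a free-text note on the first line, then the real header.
text = (
    "exported by some tool\n"
    "name,score\n"
    "Alice,90\n"
    "Bob,85\n"
)

# An integer skips that many lines from the top; the next line becomes the header.
df_all = pd.read_csv(io.StringIO(text), skiprows=1)

# A list skips specific 0-based line numbers: here the note (0) and Alice's row (2).
df_some = pd.read_csv(io.StringIO(text), skiprows=[0, 2])

print(df_all)
print(df_some)
```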
Those are the common approaches. pandas is recommended for everyday analysis because it is simple and efficient; the csv module is a good fit for scripts or lightweight scenarios.

