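How to Download Files with cURL: A Complete Guide

Introduction

cURL is a capable HTTP client and an equally capable file downloader. It supports many protocols (HTTP, HTTPS, FTP, SFTP, and others) and offers a wide range of download options.

This guide shows how to download files with cURL, from basic downloads to resuming interrupted transfers, batch downloads, and progress monitoring, so that you can handle downloads efficiently in different scenarios.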

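Basic File Downloads

Simple File Download

The most basic download commands:

# Download into the current directory, keeping the remote file name
curl -O https://example.com/file.zip

# Download and save under a new file name
curl -o myfile.zip https://example.com/file.zip

# Download into a specific directory (it must already exist, or add --create-dirs)
curl -o /downloads/file.zip https://example.com/file.zip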

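Options Explained

-O (uppercase O): save under the remote file name
-o (lowercase o): save under the local file name you specify
-L: follow redirects
-C -: resume an interrupted download

# Follow redirects while downloading
curl -L -O https://github.com/user/repo/archive/main.zip

# Download several files in one invocation
curl -O https://example.com/file1.zip -O https://example.com/file2.zip

# Download with verbose output
curl -v -O https://example.com/file.zip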

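Progress Monitoring and Display

Showing Download Progress

# Show a progress bar
curl --progress-bar -O https://example.com/largefile.zip

# Silent mode (hides the progress meter and error messages; add -S to keep error messages)
curl -s -O https://example.com/file.zip

# Print transfer statistics after the download finishes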
curl -w "Downloaded: %{size_download} bytes in %{time_total} seconds\nSpeed: %{speed_download} bytes/sec\n" -o file.zip https://example.com/file.zip

Custom Statistics Format

# Create a format file for -w (--write-out)
cat > curl_format.txt << 'EOF'
     time_namelookup:  %{time_namelookup}s
        time_connect:  %{time_connect}s
     time_appconnect:  %{time_appconnect}s
    time_pretransfer:  %{time_pretransfer}s
       time_redirect:  %{time_redirect}s
  time_starttransfer:  %{time_starttransfer}s
                     ----------
          time_total:  %{time_total}s
       size_download:  %{size_download} bytes
      speed_download:  %{speed_download} bytes/sec
EOF

# Use the custom format
curl -w "@curl_format.txt" -o file.zip https://example.com/file.zip
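The --write-out variables used above can also drive decisions in a script. The following is a minimal sketch, assuming you want to keep a file only when the final HTTP status is 2xx; the function name fetch_with_status is hypothetical, while %{http_code} is a standard write-out variable.

```shell
# Sketch: capture the HTTP status separately from the downloaded body.
# fetch_with_status is an illustrative helper name, not part of curl.
fetch_with_status() {
    local url="$1" output="$2" status

    # The body goes to -o, so stdout carries only the -w text (the status code)
    status=$(curl -sS -L -o "$output" -w '%{http_code}' "$url") || return 1

    case "$status" in
        2??) echo "OK ($status): $output" ;;
        *)   echo "Unexpected HTTP status $status for $url" >&2
             rm -f "$output"      # discard the error page the server sent
             return 1 ;;
    esac
}

# Example call (hypothetical URL):
# fetch_with_status "https://example.com/file.zip" "file.zip"
```

Passing -f (--fail) to curl gives a similar effect through the exit code alone; reading the status explicitly is mainly useful when the script needs to log or branch on the exact code.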

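Resuming Downloads

Basic Resume

# Detect the existing partial file and resume where it stopped
curl -C - -O https://example.com/largefile.zip

# Resume from a given byte offset (assuming 1024 bytes are already downloaded)
curl -C 1024 -O https://example.com/largefile.zip

# Combine resume with other options
curl -L -C - --progress-bar -O https://example.com/largefile.zip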

Resume-and-Retry Script

#!/bin/bash
# Download script with resume and retry support

download_with_retry() {
    local url="$1"
    local output="$2"
    local max_retries=3
    local retry_count=0

    while [ "$retry_count" -lt "$max_retries" ]; do
        echo "Download attempt $((retry_count + 1)): $url"

        # Resume if a partial file exists; -f makes HTTP errors (4xx/5xx) count as failures
        if curl -f -L -C - --progress-bar -o "$output" "$url"; then
            echo "Download succeeded: $output"
            return 0
        else
            retry_count=$((retry_count + 1))
            if [ "$retry_count" -lt "$max_retries" ]; then
                echo "Download failed, retrying in 5 seconds..."
                sleep 5
            fi
        fi
    done

    echo "Download failed after $max_retries attempts"
    return 1
}

# Example usage
download_with_retry "https://example.com/largefile.zip" "largefile.zip"
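The fixed five-second wait in download_with_retry can be swapped for an exponential backoff, which is gentler on a server that is struggling. This is a sketch under assumptions: the function name download_with_backoff and its optional third and fourth parameters (attempt limit and initial delay) are illustrative choices, not anything curl defines.

```shell
# Sketch: retry a resumable download, doubling the wait after each failure.
# download_with_backoff and its parameters are illustrative names.
download_with_backoff() {
    local url="$1" output="$2" max_retries="${3:-4}" delay="${4:-2}"
    local attempt=1

    while [ "$attempt" -le "$max_retries" ]; do
        # -f: HTTP 4xx/5xx count as failures; -C -: resume a partial file
        if curl -f -L -C - -sS -o "$output" "$url"; then
            echo "Downloaded $output on attempt $attempt"
            return 0
        fi
        if [ "$attempt" -lt "$max_retries" ]; then
            echo "Attempt $attempt failed; retrying in ${delay}s" >&2
            sleep "$delay"
            delay=$((delay * 2))   # 2s, 4s, 8s, ...
        fi
        attempt=$((attempt + 1))
    done

    echo "Giving up on $url after $max_retries attempts" >&2
    return 1
}

# Example call (hypothetical URL):
# download_with_backoff "https://example.com/largefile.zip" "largefile.zip"
```

curl also has built-in --retry and --retry-delay options; a shell loop like this one is mostly worthwhile when you want custom logging or a different policy between attempts.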

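Batch Downloads

Downloading from a URL List

# Create the URL list file
cat > urls.txt << 'EOF'
https://example.com/file1.zip
https://example.com/file2.pdf
https://example.com/file3.mp4
https://example.com/file4.tar.gz
EOF

# Method 1: xargs, running up to 4 downloads in parallel
xargs -n 1 -P 4 curl -f -L -O < urls.txt

# Method 2: a sequential while loop
while IFS= read -r url; do
    echo "Downloading: $url"
    curl -f -L -O "$url"
done < urls.txt

# Method 3: parallel download script (starts one background job per URL,
# so keep the list short or use Method 1 to cap concurrency)
#!/bin/bash
while IFS= read -r url; do
    {
        filename=$(basename "$url")
        echo "Starting: $filename"
        # -sS instead of a progress bar: parallel progress bars would overwrite each other
        if curl -f -L -C - -sS -o "$filename" "$url"; then
            echo "Finished: $filename"
        else
            echo "Failed: $filename" >&2
        fi
    } &
done < urls.txt
wait  # Wait for all downloads to finish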
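Method 3 derives the local file name with basename, which keeps any query string or fragment that happens to be in the URL (for example file.zip?token=abc). The sketch below shows one possible way to clean that up with plain parameter expansion; url_to_filename and the fallback name "download" are hypothetical choices.

```shell
# Sketch: derive a local file name from a URL, ignoring query string and fragment.
# url_to_filename and the "download" fallback are illustrative, not curl features.
url_to_filename() {
    local url="$1" name

    name=${url%%#*}         # drop the fragment (#...)
    name=${name%%[?]*}      # drop the query string (?...)
    name=${name#*://}       # drop the scheme, leaving host/path
    case "$name" in
        */*) name=${name#*/} ;;   # drop the host, keep the path
        *)   name="" ;;           # no path at all
    esac
    name=${name%/}          # drop one trailing slash
    name=${name##*/}        # keep the last path segment

    [ -n "$name" ] || name="download"
    printf '%s\n' "$name"
}

# Example use inside the batch loop:
# curl -f -L -o "$(url_to_filename "$url")" "$url"
```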

Range Downloads (Fetching Part of a File)

# Download the first 1024 bytes
curl -r 0-1023 -o partial.zip https://example.com/file.zip

# Download everything from byte offset 1024 onward
curl -r 1024- -o remaining.zip https://example.com/file.zip

# Download a middle section (byte offsets 1024 through 2047)
curl -r 1024-2047 -o middle.zip https://example.com/file.zip

# Download the last 1024 bytes
curl -r -1024 -o last_part.zip https://example.com/file.zip
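Range requests also make it possible to fetch a file in several segments and join them afterwards. The arithmetic for the segment boundaries can be sketched as follows; byte_ranges is a hypothetical helper, and the approach assumes the server honors range requests and that you already know the total size (for instance from the Content-Length header).

```shell
# Sketch: print contiguous byte ranges that together cover a file of a given size.
# byte_ranges is an illustrative helper name.
byte_ranges() {
    local total="$1" parts="$2"
    local chunk=$(( (total + parts - 1) / parts ))   # ceiling division
    local start=0 end

    while [ "$start" -lt "$total" ]; do
        end=$((start + chunk - 1))
        [ "$end" -lt "$total" ] || end=$((total - 1))   # clamp the last range
        echo "${start}-${end}"
        start=$((end + 1))
    done
}

# Example: fetch a 4096-byte file in four pieces, then join them (hypothetical URL).
# i=0
# for range in $(byte_ranges 4096 4); do
#     curl -f -sS -r "$range" -o "part.$i" https://example.com/file.zip
#     i=$((i + 1))
# done
# cat part.0 part.1 part.2 part.3 > file.zip
```

Ranges are inclusive on both ends, which is why each range ends one byte before the next one starts.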

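Authenticated Downloads

Basic Authentication

# HTTP basic authentication (the password is visible in shell history and process listings)
curl -u username:password -O https://example.com/protected/file.zip

# Supply only the user name; curl prompts for the password
curl -u username -O https://example.com/protected/file.zip

# Keep the credentials in an environment variable
export CURL_USER="username:password"
curl -u "$CURL_USER" -O https://example.com/protected/file.zip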

Bearer Token Authentication

# Use a bearer token
curl -H "Authorization: Bearer your_token_here" -O https://api.example.com/files/document.pdf

# Read the token from a file
TOKEN=$(cat token.txt)
curl -H "Authorization: Bearer $TOKEN" -O https://api.example.com/files/document.pdf

# API key authentication (the header name depends on the service)
curl -H "X-API-Key: your_api_key" -O https://api.example.com/files/document.pdf

Cookie Authentication

# Send cookies from a cookie file
curl -b cookies.txt -O https://example.com/protected/file.zip

# Set cookies directly
curl -b "sessionid=abc123; csrftoken=xyz789" -O https://example.com/protected/file.zip

# Send cookies from cookies.txt and write any updated cookies back to it
curl -c cookies.txt -b cookies.txt -O https://example.com/protected/file.zip

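FTP Downloads

Basic FTP Downloads

# Anonymous FTP download
curl -O ftp://ftp.example.com/pub/file.zip

# With a user name and password
curl -u username:password -O ftp://ftp.example.com/private/file.zip

# List the contents of an FTP directory
curl ftp://ftp.example.com/pub/

# Download every file in a directory. curl has no recursive mode, so list the
# names with -l (--list-only) and fetch them one by one (subdirectories are not followed)
curl -s -l -u username:password ftp://ftp.example.com/directory/ | tr -d '\r' |
while IFS= read -r name; do
    curl -u username:password -O "ftp://ftp.example.com/directory/$name"
done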

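SFTP/SCP Downloads

# SFTP download (requires a curl build with SSH support)
curl -u username:password -O sftp://example.com/path/to/file.zip

# Authenticate with an SSH key (the colon after the user name means "no password")
curl --key ~/.ssh/id_rsa --pubkey ~/.ssh/id_rsa.pub -u username: -O sftp://example.com/file.zip

# SCP download
curl -u username:password -O scp://example.com/path/to/file.zip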

Advanced Download Options

Rate Limits and Timeouts

# Limit the download speed to 100 KB/s
curl --limit-rate 100k -O https://example.com/file.zip

# Set the connection timeout (10 seconds)
curl --connect-timeout 10 -O https://example.com/file.zip

# Set the maximum total transfer time (300 seconds)
curl --max-time 300 -O https://example.com/file.zip

# Combine several options
curl --limit-rate 500k --connect-timeout 30 --max-time 600 -C - -L -O https://example.com/largefile.zip

Proxy Settings

# Use an HTTP proxy
curl --proxy http://proxy.example.com:8080 -O https://example.com/file.zip

# Use a SOCKS5 proxy (--socks5 takes host:port; with --proxy, use a socks5:// URL instead)
curl --socks5 proxy.example.com:1080 -O https://example.com/file.zip

# Proxy authentication
curl --proxy-user username:password --proxy http://proxy.example.com:8080 -O https://example.com/file.zip

# Configure the proxy through environment variables
export http_proxy=http://proxy.example.com:8080
export https_proxy=http://proxy.example.com:8080
curl -O https://example.com/file.zip

SSL/TLS Options

# Skip certificate verification (not recommended in production)
curl -k -O https://example.com/file.zip

# Specify a CA certificate file
curl --cacert /path/to/ca-cert.pem -O https://example.com/file.zip

# Use a client certificate
curl --cert client.pem --key client.key -O https://example.com/file.zip

# Require TLS 1.2 or later
curl --tlsv1.2 -O https://example.com/file.zip

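File Integrity Verification

Download and Verify a Checksum

#!/bin/bash
# Download a file and verify its MD5 checksum
# (MD5 only guards against accidental corruption; prefer SHA-256 when the publisher provides it)

download_and_verify() {
    local url="$1"
    local expected_md5="$2"
    local filename actual_md5
    filename=$(basename "$url")

    echo "Downloading: $filename"
    # -f makes HTTP errors (4xx/5xx) count as failures instead of saving the error page
    if curl -f -L -C - --progress-bar -o "$filename" "$url"; then
        echo "Download complete, verifying checksum..."
        actual_md5=$(md5sum "$filename" | cut -d' ' -f1)

        if [ "$actual_md5" = "$expected_md5" ]; then
            echo "✓ Checksum verified"
            return 0
        else
            echo "✗ Checksum mismatch"
            echo "Expected: $expected_md5"
            echo "Actual:   $actual_md5"
            return 1
        fi
    else
        echo "Download failed"
        return 1
    fi
}

# Example usage (the hash is a placeholder: it is the MD5 of an empty file)
download_and_verify "https://example.com/file.zip" "d41d8cd98f00b204e9800998ecf8427e"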

SHA256 Verification Script

#!/bin/bash
# Download a file and verify its SHA256 checksum

verify_sha256() {
    local file="$1"
    local expected_sha256="$2"
    local actual_sha256

    if [ ! -f "$file" ]; then
        echo "File not found: $file"
        return 1
    fi

    actual_sha256=$(sha256sum "$file" | cut -d' ' -f1)

    if [ "$actual_sha256" = "$expected_sha256" ]; then
        echo "✓ SHA256 check passed: $file"
        return 0
    else
        echo "✗ SHA256 check failed: $file"
        echo "Expected: $expected_sha256"
        echo "Actual:   $actual_sha256"
        return 1
    fi
}

# Download and verify (the hash is a placeholder: it is the SHA256 of an empty file)
URL="https://example.com/file.zip"
EXPECTED_SHA256="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
FILENAME=$(basename "$URL")

curl -L -C - -o "$FILENAME" "$URL" && verify_sha256 "$FILENAME" "$EXPECTED_SHA256"
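Because the download command above resumes with -C -, a file that fails verification would be resumed rather than replaced on the next run and could keep failing. One possible refinement, sketched here, is to delete the file on a mismatch so the next attempt starts clean. The helper names sha256_of and verify_or_discard are hypothetical, and the shasum fallback assumes a macOS-style system without sha256sum.

```shell
# Sketch: verify a SHA256 checksum and remove the file when it does not match.
# sha256_of and verify_or_discard are illustrative helper names.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1   # assumed fallback, e.g. on macOS
    fi
}

verify_or_discard() {
    local file="$1" expected="$2" actual

    actual=$(sha256_of "$file") || return 1
    if [ "$actual" = "$expected" ]; then
        echo "checksum OK: $file"
    else
        echo "checksum mismatch, removing $file" >&2
        rm -f "$file"    # force a clean download next time instead of a resume
        return 1
    fi
}

# Example use (hypothetical URL and variables):
# curl -f -L -C - -o "$FILENAME" "$URL" && verify_or_discard "$FILENAME" "$EXPECTED_SHA256"
```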

Download Automation

Download Script with Logging

#!/bin/bash
# Download script with logging, retries, and error handling

DOWNLOAD_DIR="./downloads"
LOG_FILE="download.log"
MAX_RETRIES=3
RETRY_DELAY=5

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

download_file() {
    local url="$1"
    local output_dir="$2"
    local retry_count=0
    local filename output_path file_size final_size

    filename=$(basename "$url")
    output_path="$output_dir/$filename"

    # Create the output directory
    mkdir -p "$output_dir"

    log "Starting download: $url"

    while [ "$retry_count" -lt "$MAX_RETRIES" ]; do
        log "Attempt #$((retry_count + 1)): $filename"

        # Check for a partially downloaded file
        # (stat -c%s is GNU stat; on macOS/BSD use stat -f%z)
        if [ -f "$output_path" ]; then
            file_size=$(stat -c%s "$output_path" 2>/dev/null || echo 0)
            log "Found existing file ($file_size bytes), attempting to resume"
        fi

        # Run the download. -sS hides the progress meter so that only error
        # messages reach the log; -f makes HTTP errors count as failures.
        if curl -f -sS -L -C - --connect-timeout 30 --max-time 3600 \
                -o "$output_path" "$url" 2>>"$LOG_FILE"; then
            log "✓ Download succeeded: $filename"

            # Log the final file size
            final_size=$(stat -c%s "$output_path")
            log "Final file size: $final_size bytes"

            return 0
        else
            retry_count=$((retry_count + 1))

            if [ "$retry_count" -lt "$MAX_RETRIES" ]; then
                log "✗ Download failed, retrying in $RETRY_DELAY seconds..."
                sleep "$RETRY_DELAY"
            fi
        fi
    done

    log "✗ Download failed after $MAX_RETRIES attempts: $filename"
    return 1
}

# Batch download function
batch_download() {
    local url_file="$1"
    local output_dir="$2"
    local success_count=0
    local total_count=0

    log "Starting batch download, URL file: $url_file"

    while IFS= read -r url; do
        # Skip blank lines and comments
        if [[ -z "$url" || "$url" =~ ^[[:space:]]*# ]]; then
            continue
        fi

        total_count=$((total_count + 1))

        if download_file "$url" "$output_dir"; then
            success_count=$((success_count + 1))
        fi

        log "Progress: $success_count/$total_count succeeded"
    done < "$url_file"

    log "Batch download finished: $success_count/$total_count succeeded"

    if [ "$success_count" -eq "$total_count" ]; then
        return 0
    else
        return 1
    fi
}

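# Example usage
if [ $# -eq 0 ]; then
    echo "Usage:"
    echo "  $0 <url>              # Download a single file"
    echo "  $0 -f <url_file>      # Batch download from a URL list"
    echo "  $0 -d <dir> <url>     # Download into a specific directory"
    exit 1
fi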

case "$1" in
    -f)
        batch_download "$2" "$DOWNLOAD_DIR"
        ;;
    -d)
        download_file "$3" "$2"
        ;;
    *)
        download_file "$1" "$DOWNLOAD_DIR"
        ;;
esac

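Troubleshooting

Common Problems and Solutions

1. Connection Timeouts

# Raise the timeouts and let curl retry
curl --connect-timeout 60 --max-time 3600 --retry 3 --retry-delay 10 -L -C - -O https://example.com/file.zip

2. SSL Certificate Problems

# Point curl at an up-to-date CA bundle (the path shown is the Debian/Ubuntu default)
curl --cacert /etc/ssl/certs/ca-certificates.crt -O https://example.com/file.zip

# Temporarily skip certificate verification (testing only)
curl -k -O https://example.com/file.zip

3. Redirect Problems

# Follow redirects, but cap how many are followed
curl -L --max-redirs 5 -O https://example.com/file.zip

4. User-Agent Problems

# Send a common browser user agent
curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" -O https://example.com/file.zip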

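Debugging Tips

# Print verbose debugging output
curl -v -O https://example.com/file.zip

# Show only the response headers (sends a HEAD request)
curl -I https://example.com/file.zip

# Trace redirects
curl -L -v -o /dev/null https://example.com/file.zip

# Save the debugging output to a file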
curl -v -o file.zip https://example.com/file.zip 2> debug.log
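When a download fails inside a script, curl's exit code is often the quickest clue. The sketch below maps a few common codes to short hints; explain_curl_exit is a hypothetical helper, the list is deliberately partial, and the wording of each hint is a suggestion rather than curl's own message.

```shell
# Sketch: translate common curl exit codes into troubleshooting hints.
# explain_curl_exit is an illustrative helper; only a few codes are covered.
explain_curl_exit() {
    case "$1" in
        0)  echo "success" ;;
        6)  echo "could not resolve host: check DNS or the URL" ;;
        7)  echo "failed to connect: check host, port, firewall, or proxy" ;;
        18) echo "partial file: transfer ended early, retry with -C -" ;;
        22) echo "HTTP error (4xx/5xx) reported because of -f/--fail" ;;
        23) echo "write error: check disk space and permissions" ;;
        28) echo "operation timed out: raise --connect-timeout or --max-time" ;;
        35) echo "TLS handshake failed: check protocol versions and ciphers" ;;
        60) echo "certificate verification failed: check the CA bundle" ;;
        *)  echo "curl exit code $1: see the EXIT CODES section of the curl man page" ;;
    esac
}

# Example use (hypothetical URL):
# curl -f -sS -O https://example.com/file.zip
# explain_curl_exit "$?"
```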

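Best Practices

1. Secure Downloads

Security recommendations:

Always verify the checksum of downloaded files
Use HTTPS rather than HTTP
Avoid disabling SSL certificate verification
Validate URLs that come from user input
Set reasonable timeouts

2. Performance Optimization

Performance tips:

Resume interrupted downloads instead of starting over
Choose a sensible number of concurrent downloads
Tune the retry strategy to network conditions
Use progress monitoring to track download status
Consider a proxy when the direct connection is slow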

3. Script Conventions

#!/bin/bash
# Template for a well-behaved download script

set -euo pipefail  # Strict mode: exit on errors, unset variables, and failed pipelines

# Configuration
readonly SCRIPT_NAME=$(basename "$0")
readonly LOG_FILE="${SCRIPT_NAME%.sh}.log"
readonly DOWNLOAD_DIR="./downloads"
readonly MAX_RETRIES=3

# Error handling
trap 'echo "Script exited abnormally, check the log: $LOG_FILE" >&2' ERR

# Logging function
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}

# Argument validation
validate_url() {
    local url="$1"
    if [[ ! "$url" =~ ^https?:// ]]; then
        log "Error: invalid URL: $url"
        return 1
    fi
}

# Main download logic
main() {
    if [ $# -eq 0 ]; then
        echo "Usage: $SCRIPT_NAME <url>"
        exit 1
    fi

    local url="$1"
    validate_url "$url"

    # Download logic...
}

main "$@"
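The template leaves the download step in main open. One way it might be filled in, offered only as a sketch, is to download into a temporary ".part" file and rename it once curl succeeds, so that other programs never see a half-written file under the final name. The function name download_atomically and the ".part" suffix are assumptions made for this example.

```shell
# Sketch: download to a temporary name and rename only after success.
# download_atomically and the ".part" suffix are illustrative choices.
download_atomically() {
    local url="$1" dest="$2"
    local partial="${dest}.part"

    # -C - resumes into the .part file if an earlier run was interrupted
    if curl -f -L -C - -sS -o "$partial" "$url"; then
        mv -f "$partial" "$dest"
    else
        echo "download failed; partial data kept in $partial" >&2
        return 1
    fi
}

# Example call from main (hypothetical destination path):
# download_atomically "$url" "$DOWNLOAD_DIR/$(basename "$url")"
```

Keeping the partial file on failure lets the next run resume it, while the final name only ever refers to a complete download.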

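Summary

cURL is a powerful file download tool that supports many protocols and advanced features. With the techniques in this guide, you can:

Download many kinds of files efficiently
Recover from network interruptions and connection problems
Automate batch downloads
Verify the integrity and safety of downloaded files
Improve download performance and the user experience

Keep security, error handling, and the user experience in mind, and write download scripts that are robust and reliable. Used sensibly, cURL's options cover most file download scenarios.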