Info on dataset download for Windows users #22

@jpedataeditor

Hi there,

I'm managing a team of about 20 people who need to interact with the JPE Dataverse. We've had some issues with the built-in zip downloader on the Dataverse website, so I'm exploring alternatives. I'd like your opinion on the following:

  • What's the best programmatic approach to downloading full datasets for Windows users? Would you recommend running your downloader script in something like WSL?
  • I have written some Julia code to query the API directly. What's your view on the stability of this approach? Does your downloader use the same API, i.e. api/access/dataset/? Would my approach below fail if the zip-downloading mechanism on the Dataverse website is down? Thanks!
using HTTP

"Download an entire dataset from the Dataverse (dv) and unpack it."
function dv_download_dataset(doi_or_url::String)

    # dv_doi_from_url, root and dvserver are my own helpers (not shown here)
    persistent_id = dv_doi_from_url(doi_or_url)
    println(persistent_id)

    haskey(ENV, "JPE_DV") || error("You must set ENV var JPE_DV")

    dest = joinpath(root(), "replication-package")

    # Construct the request URL
    url = "$(dvserver())/api/access/dataset/:persistentId/?persistentId=$(persistent_id)"

    # Set the request headers with the Dataverse API key
    headers = Dict("X-Dataverse-key" => ENV["JPE_DV"])

    # Send the GET request to the API; status_exception=false makes HTTP.jl
    # return non-2xx responses instead of throwing, so the else branch can run
    response = HTTP.get(url, headers; status_exception=false)

    # Handle the response (assuming the dataset is available for download)
    if response.status == 200
        @info "Dataset downloaded successfully."
        # Save the response body to a temporary zip file
        filename = "dataset_download.zip"
        open(filename, "w") do f
            write(f, response.body)
        end
        @info "Dataset saved to: $filename"
        # Unzip into dest (needs `unzip` on the PATH, e.g. inside WSL)
        mkpath(dest)
        run(`unzip $filename -d $dest`)
        rm(filename)
    else
        @warn "Failed to fetch dataset." status=response.status
    end
end
