GitHub repository: https://github.com/KsenoLv/Server/blob/Server/PostgreSQL/Tables.sql
In modern times, data is one of the most valuable resources. Managing data efficiently is critical to the success of any enterprise. Databases are a primary component in data management systems, and SQL is one of the most popular languages used for managing databases. In this article, we will discuss how to create a database using SQL, specifically for NYSE data.
File System of Our Database.
Before we start creating our database, let’s take a look at the file system we will be using:
- nyse_data.csv – main data content file
- nyse_data_new.csv – temporary data file
- nyse_ticker.txt – contains all NYSE tickers
Create table:
The first step in creating our database is to create tables. We will create two tables: nyse_ticker_list
and nyse_data
.
The schema of the nyse_ticker_list
table is as follows:
CREATE TABLE nyse_ticker_list(
ticker VARCHAR(5) NOT NULL,
name VARCHAR(100)
);
The schema of the nyse_data
and nyse_data_new
table is as follows:
CREATE TABLE nyse_data(
Date DATE NOT NULL
,Symbols VARCHAR(10) NOT NULL
,Adj_Close NUMERIC(21, 15) NOT NULL
,Close NUMERIC(21, 15) NOT NULL
,High NUMERIC(21, 15) NOT NULL
,Low NUMERIC(21, 15) NOT NULL
,Open NUMERIC(21, 15) NOT NULL
,Volume NUMERIC(13, 2) NOT NULL);
Inserting Data.
Now that we have created our tables, we can insert data into them. We will be inserting data into both the nyse_ticker_list
and nyse_data
tables.
Inserting data into nyse_ticker_list
:
COPY nyse_ticker_list FROM 'C:\file_path\nyse_ticker.txt' DELIMITER ',' CSV HEADER;
Inserting data into nyse_data
:
COPY nyse_data FROM 'C:\file_path\nyse_data.csv' DELIMITER ',' CSV HEADER;
Checking Data
Once we have inserted data into our tables, we can perform a quick check to ensure that the data has been inserted correctly. To do this, we can run the following query:
SELECT COUNT(*) FROM nyse_data;
This should return a count of more than 11 million, which is the number of rows in the nyse_data
table.
There are a few other things to keep in mind when working with databases:
- Checking for duplicates, NULL values, and incorrect data is essential for maintaining data integrity.
- To automate the process of inserting data, we can use libraries such as pandas_datareader.
- To save each ticker in a separate file, we can use the ticker name as the file name.
- We can use the schedule library to automate the process of downloading data.
Conclusion.
In this article, we discussed how to create a database using SQL for NYSE data. We went over the file system of our database, created tables, inserted data, and checked the data. We also discussed some other considerations to keep in mind when working with databases. By following these guidelines, we can manage data efficiently and maintain data integrity, which is critical for the success of any enterprise.
Leave a Reply