Database automation with PgAgent.
If you have never worked with this application for PgAdmin, check out its features at this link: PgAgent documentation.
We write a script to add basic data to our main table. This script is written in the standard input window of PgAdmin.
copy nyse_data FROM 'C:\file_path\nyse_data.csv' DELIMITER ',' CSV HEADER; copy nyse_ticker_list FROM 'C:\file_path\nyse_ticker.txt' DELIMITER ',' CSV HEADER;
Check our data. At the time of writing, the number of shares must be greater than 2700 and the number of attributes must be at least 11 million.
You can delete file nyse_data.csv from your hard disk after inserting!
select count(*) from nyse_data; select * from nyse_ticker_list;
We have an empty table nyse_data_new to store new Yahoo Finance incoming data. This table will store all new quotes for the selected instruments. This is done to avoid duplicates in the main table. Since we use a daily interval, our table will store yesterday’s data.
We automate the addition of new data from the /IN python in folder to the temporary table nyse_data_new. To do this, we create a script inside PgAgent according to a certain script.
Suppose, the Python script parses new information at 3:11 am server time, let it take exactly 1 hour to parse new information and save it to a file. That means our script in PgAgent should run at about 5:40.
copy nyse_data_new FROM 'C:\file_path\nyse_data_new.csv' DELIMITER ',' CSV HEADER;
We created our first automated script in PL/pgSQL. Now every night at 5:40, the script will automatically take all new data from the nyse.csv file and add it to table nyse_data_new.
Now we need to link our main table nyse_data with our temporary nyse_data_new. To do this we will use a function and a trigger.
create or replace function nyse_trig() returns trigger AS $$ declare Date DATE; Symbols VARCHAR(10); Adj_Close NUMERIC(21, 15); Close NUMERIC(21, 15); High NUMERIC(21, 15); Low NUMERIC(21, 15); Open NUMERIC(21, 15); Volume NUMERIC(13, 2); BEGIN date = new.date; Symbols = new.symbols; Adj_close = new.adj_close; close = new.close; High = new.High; Low = new.Low; Open = new.Open; Volume = new.volume; if(date = CURRENT_DATE -1) then insert into nyse_data values( new.date, new.symbols, new.adj_close, new.close, new.High, new.Low, new.Open, new.volume ); end if; return new; END; $$ language plpgsql -- Trigger create trigger nyse_trig_on_insert after insert on nyse_data_new for each row execute procedure nyse_trig()
I will not explain all settings of these scripts because it will take a lot of time. If you want to know more about them, there are hundreds of PL/pgSQL function and trigger descriptions on the Internet. At this point, you can also choose which information you want to migrate from the temporary table to the main table. You can also find information on how to do this on the Internet or just think logically.
Now, we have to check if all tickers have updated.
Create an automatic script execution in PgAgent, for example at 6:30 am server time. The script compares our main data table with the list of tickers of the New York Stock Exchange.
The script automatically creates a list of tickers that are missing in the main table, also converting the information into an understandable language for the Python parser.
COPY (SELECT '[' || array_to_string(array_agg(''''||ticker||''','), ' ') || ']' FROM nyse_ticker_list WHERE ticker NOT IN (select symbols from nyse_data WHERE date = CURRENT_DATE-1 GROUP BY symbols)) TO 'C:\folder_name\file_name.csv';
After parsing the file again with the python parsing script, all we have to do is add the new data back into the temporary table. After that, our trigger and function will automatically pick up the necessary information for the main table.
This is done in exactly the same way we did the script for adding new information to the nyse_data_new, just change the execution time and file name.
As a result, we should have the following scheme.
|3:11 Am||Python starts downloading new DATA.|
|5:40 Am||PostgreSql add it to table nyse_data_new|
|6:30 Am||Checking and creating file with missing tickers.|
|7:03 Am||Python download missing data.|
|8:01 Am||PostgreSql add missing tickers in to nyse_data_new|
Remember this is just an example.
Wish you a great day and make your dream come thru.