Challenge 9: Processing Invoices at the End of a Quarter
Level: Hard
Description: You’re a data analyst supporting various Finance departments in a big company, and your main goal is to upgrade their processes with the latest tools and technology. It's January 2nd, and all supplier invoices from the previous quarter need to be processed. Can you assist the finance department and find ways to:
1. Read 500+ e-invoices in XML format and extract relevant data;
2. Assist with the creation of a management reporting package;
3. Assist with performing internal controls: using the two datasets provided by the Procurement department, identify any invoices that (a) do not have a PO Number, (b) do not match a PO Number issued by the company, or (c) have an invoice date earlier than the corresponding Purchase Order date;
4. Assist with a report to be provided to Financial Planning & Analysis to inform their forecast. The report should show any amounts remaining on a Purchase Order where not all items were invoiced in full. To determine the timing of future invoicing, assume that all remaining items on a Purchase Order will be invoiced at once, exactly 14 months after the first invoice date. The report should also outline the top 3 suppliers by remaining amount on invoices, as well as the top 3 products. Note: There are two ways in which a Purchase Order can be partially invoiced. First, only some line items may have been fully invoiced (e.g., 2 out of 5 line items). Second, a line item with a quantity larger than 1 may have been only partially invoiced (e.g., 5 units were ordered on item 1 and only 3 were invoiced).
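The partial-invoicing rules in task 4 can be sketched in plain Python. This is only a minimal illustration under assumed data shapes: the tuple layouts, the sample figures, and the helper names `add_months` and `remaining_by_po` are hypothetical and do not come from the challenge datasets. Purchase Orders with no invoice at all are left out here, since the brief defines the forecast date relative to the first invoice date.

```python
import calendar
from datetime import date
from decimal import Decimal


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def remaining_by_po(po_lines, invoice_lines):
    """Return {po_number: (remaining_amount, forecast_date)} for partially invoiced POs.

    po_lines:      iterable of (po_number, line_no, ordered_qty, unit_price)
    invoice_lines: iterable of (po_number, line_no, invoiced_qty, invoice_date)
    """
    invoiced_qty = {}
    first_invoice = {}
    for po, line, qty, inv_date in invoice_lines:
        # Accumulate invoiced units per PO line (covers several invoices per line).
        invoiced_qty[(po, line)] = invoiced_qty.get((po, line), 0) + qty
        if po not in first_invoice or inv_date < first_invoice[po]:
            first_invoice[po] = inv_date

    remaining = {}
    for po, line, ordered, price in po_lines:
        # A line that was never invoiced has its full quantity open;
        # a partially invoiced line has only the difference open.
        open_qty = ordered - invoiced_qty.get((po, line), 0)
        if open_qty > 0:
            remaining[po] = remaining.get(po, Decimal("0")) + open_qty * price

    # Assumption: only POs with at least one invoice get a forecast date.
    return {
        po: (amount, add_months(first_invoice[po], 14))
        for po, amount in remaining.items()
        if po in first_invoice
    }


# Hypothetical sample data
po_lines = [
    ("PO-1", 1, 5, Decimal("10.00")),   # 5 ordered, 3 invoiced below
    ("PO-1", 2, 2, Decimal("50.00")),   # never invoiced
    ("PO-2", 1, 1, Decimal("99.00")),   # fully invoiced
]
invoice_lines = [
    ("PO-1", 1, 3, date(2023, 11, 30)),
    ("PO-2", 1, 1, date(2023, 10, 5)),
]
report = remaining_by_po(po_lines, invoice_lines)
```

Both partial-invoicing cases reduce to the same calculation once quantities are compared line by line, which is why the sketch keys everything on the (PO number, line number) pair.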
Beginner-friendly objectives: 1. Extract and preprocess data from XML and table files, ensuring the data is clean and ready for analysis; 2. Perform basic data transformations, such as converting string data to numerical and date formats.
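As a starting point for the beginner objectives, the sketch below reads one invoice with Python's standard `xml.etree.ElementTree` module and converts the string fields to numeric and date types. The XML layout and every element name (`InvoiceNumber`, `PONumber`, `Lines/Line`, and so on) are assumptions made for illustration; the real e-invoice schema will likely differ and may use namespaces.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical invoice layout; adapt the element names to the actual files.
SAMPLE_XML = """
<Invoice>
  <InvoiceNumber>INV-001</InvoiceNumber>
  <InvoiceDate>2023-11-30</InvoiceDate>
  <PONumber>PO-1</PONumber>
  <Supplier>Acme Ltd</Supplier>
  <Lines>
    <Line><Item>Widget</Item><Quantity>3</Quantity><UnitPrice>10.00</UnitPrice></Line>
    <Line><Item>Gadget</Item><Quantity>1</Quantity><UnitPrice>24.50</UnitPrice></Line>
  </Lines>
</Invoice>
"""


def parse_invoice(xml_text: str) -> dict:
    """Extract header fields and line items, converting strings to typed values."""
    root = ET.fromstring(xml_text)

    lines = []
    for line in root.iterfind("Lines/Line"):
        lines.append({
            "item": line.findtext("Item"),
            "quantity": int(line.findtext("Quantity")),
            # Decimal avoids float rounding issues with monetary amounts.
            "unit_price": Decimal(line.findtext("UnitPrice")),
        })

    # A missing or empty PO number becomes None so it can be flagged later.
    po_number = (root.findtext("PONumber") or "").strip() or None

    return {
        "invoice_number": root.findtext("InvoiceNumber"),
        "invoice_date": date.fromisoformat(root.findtext("InvoiceDate")),
        "po_number": po_number,
        "supplier": root.findtext("Supplier"),
        "lines": lines,
        "total": sum(l["quantity"] * l["unit_price"] for l in lines),
    }


invoice = parse_invoice(SAMPLE_XML)
```

For the full batch, the same function could be applied to each file found with `pathlib.Path(folder).glob("*.xml")`, reading each one with `ET.parse(path).getroot()` instead of `ET.fromstring`.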
Intermediate-friendly objectives: 1. Implement data aggregation and filtering techniques to summarize and refine the dataset for further analysis; 2. Create visualizations to represent the aggregated data, focusing on key metrics and insights.
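One possible shape for the aggregation and filtering step is sketched below: it sums an amount per key, drops non-positive rows, and returns the largest totals, which is the pattern behind a "top 3 suppliers" or "top 3 products" table. The row layout, the sample figures, and the helper name `top_n` are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical rows: (supplier, product, remaining_amount)
rows = [
    ("Acme Ltd", "Widget", Decimal("120.00")),
    ("Acme Ltd", "Gadget", Decimal("30.00")),
    ("Bolt GmbH", "Widget", Decimal("75.50")),
    ("Cog SA", "Sprocket", Decimal("200.00")),
    ("Dyn Inc", "Gadget", Decimal("10.00")),
]


def top_n(rows, key_index, n=3):
    """Sum the amount (column 2) per key column and return the n largest totals."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in rows:
        if row[2] > 0:  # filter: ignore rows with nothing remaining
            totals[row[key_index]] += row[2]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


top_suppliers = top_n(rows, key_index=0)
top_products = top_n(rows, key_index=1)
```

The resulting lists of (name, total) pairs can be passed directly to a charting library of your choice for the visualization objective.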
Advanced objectives: 1. Integrate multiple data sources and perform complex joins to enrich the dataset with additional information; 2. Develop a comprehensive report that includes all visualizations and insights, ready for presentation or further analysis.
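Joining invoice data against the Procurement data is what makes the three internal controls in task 3 possible. The sketch below shows the idea with plain dictionaries; the field names, the sample records, and the helper name `control_flag` are assumptions, and a real solution would more likely perform the same lookup as a left join in a dataframe library.

```python
from datetime import date

# Hypothetical PO register from Procurement: PO number -> PO date
po_dates = {
    "PO-1": date(2023, 10, 1),
    "PO-2": date(2023, 11, 15),
}

# Hypothetical invoice headers, e.g. as extracted from the XML files
invoices = [
    {"invoice": "INV-001", "po_number": "PO-1", "invoice_date": date(2023, 11, 30)},
    {"invoice": "INV-002", "po_number": None, "invoice_date": date(2023, 12, 1)},
    {"invoice": "INV-003", "po_number": "PO-9", "invoice_date": date(2023, 12, 5)},
    {"invoice": "INV-004", "po_number": "PO-2", "invoice_date": date(2023, 11, 1)},
]


def control_flag(invoice, po_dates):
    """Classify one invoice against the three controls, checked in order (a), (b), (c)."""
    po = invoice["po_number"]
    if not po:
        return "missing_po"            # (a) no PO Number on the invoice
    if po not in po_dates:
        return "unknown_po"            # (b) PO Number not issued by the company
    if invoice["invoice_date"] < po_dates[po]:
        return "invoice_before_po"     # (c) invoice dated before the PO
    return "ok"


flags = {inv["invoice"]: control_flag(inv, po_dates) for inv in invoices}
exceptions = {k: v for k, v in flags.items() if v != "ok"}
```

Checking the controls in a fixed order gives each invoice exactly one flag, which keeps the exception report easy to reconcile against the invoice count.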