Introduction to Data Science for Business Decision-making - Financial Management and Business Data Analytics | CMA Inter Syllabus
There is a saying: ‘data is the new oil’. Over the last few years, with increasing computing power and availability of data, the importance and application of data science have grown exponentially. The field of finance and accounts has not remained untouched by this wave. In fact, to become an effective finance and accounts professional, it is very important to understand, analyse and evaluate data sets.
1.1 What is data and how it is linked to information and knowledge?
Data is a source of information, and information needs to be processed to gather knowledge. Data on its own does not convey any meaning. The relationship between data, information and knowledge may be depicted as below:
In the syllabus, data is frequently described as ‘raw’ data: a collection of text, numbers and symbols that carries no meaning by itself. Examples of ‘raw data’ could be as below:
The above shows a few data series. It is almost impossible to decipher what these data series are about, because we do not know their exact context. The first series may be a multiplication table of 2. Alternatively, it may be the marks obtained by students in a class test with full marks of 20. The second series names a few Indian brands, but we do not know why the names are mentioned here at all. In short, we must know the context to which the raw data belongs. Data on its own cannot convey any information.
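The point about context can be illustrated with a minimal Python sketch. The series values and both helper functions are hypothetical, chosen only to mirror the two readings mentioned above (a multiplication table of 2, or marks out of 20); they are not taken from the original figure.

```python
# Hypothetical raw series; the values are illustrative only.
raw_series = [2, 4, 6, 8]

def as_multiplication_table(values, base=2):
    """Read each value as 'base x n' and return readable statements."""
    return [f"{base} x {v // base} = {v}" for v in values]

def as_test_marks(values, full_marks=20):
    """Read each value as marks out of full_marks and return percentages."""
    return [f"{v}/{full_marks} = {100 * v / full_marks:.0f}%" for v in values]

# The same numbers yield different information under different contexts.
print(as_multiplication_table(raw_series))
print(as_test_marks(raw_series))
```

The raw numbers never change; only the assumed context does, and that context is what turns them into information.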
1.2 What is information?
As we discussed, data needs to be processed for gathering information. Most commonly, we take the help of computers and software packages for processing data. The exponential growth in the availability of computing power and software packages has led to the growth of data science in recent years.
If we say that the first series is really the first four numbers of the multiplication table of 2, and the third series is the highest temperature of Kolkata during the previous four days, we are actually discovering some information out of the raw data. So, we may now say:
1.3 What is knowledge?
When this ‘information’ is used for solving a problem, we say it is the use of knowledge. By having information about the highest temperatures in Kolkata for a month, we may try to estimate the sale of air conditioners. If our intention is to analyse the profitability of listed FMCG companies in India, the first information we should have is the names of the FMCG companies. So, we may say:
1.4 Nature of Data
Over the years, the magnitude and availability of data have grown exponentially. Data sets may be classified into different groups as below:
(i) Numerical data: Any data expressed as a number is numerical data. In finance, a prominent example is stock price data. Figure 8.5 below shows the daily stock prices of HUL stock. This is an example of numerical data.
(ii) Descriptive data: Sometimes information is conveyed in qualitative form. Look at the paragraph in figure 8.6, extracted from the annual report of HUL (2021-22). This is descriptive data provided by HUL in its annual report. The user may use this data to make a judicious investment decision.
Leading social and environment change
● At Hindustan Unilever, we have always strived to grow our business while protecting the planet and doing good for the people. We believe that to generate superior long-term value, we need to care for all our stakeholders – our consumers, customers, employees, shareholders and above all, the planet and society. We call it the multistakeholder model of sustainable growth. With more people entering the consumption cycle and adding to the pressure on natural resources, it will become even more important to decouple growth from environmental impact and drive positive social change.
(iii) Graphic data: A picture or graphic may tell a thousand stories. Data may also be presented in the form of a picture or graphic. For example, the stock price of HUL may be presented in the form of a picture or chart.
Data plays a very important role in the study of finance and cost accounting, and has done so from the inception of these disciplines. Be it in the form of financial statements or cost statements, finance and accounting professionals have played a significant role in helping the management make prudent decisions. The kinds of data used in finance and costing may be quantitative as well as qualitative in nature.
Types of data
There is another way of classifying the types of data. The data may be classified also as:
(i) Nominal
(ii) Ordinal
(iii) Interval
(iv) Ratio
Each gives a distinct set of traits that influences the sort of analysis that may be conducted. The differentiation between the four scale types is based on three basic characteristics:
(a) Whether the sequence of answers matters or not
(b) Whether the gap between observations is significant or interpretable, and
(c) The existence or presence of a genuine zero.
We will briefly discuss these four types below:
(i) Nominal scale: The nominal scale is used for categorising data. Under this scale, observations are classified based on certain characteristics. The category labels may contain numbers but have no numerical value. Examples include classifying equities into small-cap, mid-cap and large-cap categories, or classifying funds as equity funds, debt funds and balanced funds.
(ii) Ordinal scale: The ordinal scale is used for classifying data and putting it in order. The numbers indicate only an order; they do not specify how much better or worse one observation is than another. For example, in a list of the top 10 stocks by P/E ratio, the rank does not tell us how far apart two neighbouring stocks are.
(iii) Interval scale: The interval scale is used for categorising and ranking using equal intervals. Equal intervals separate neighbouring scale values. Because the scale's zero point is arbitrary, ratios cannot be calculated. Temperature scales are an example: a temperature of 40 degrees is 5 degrees higher than one of 35 degrees, but 0 degrees Celsius does not indicate the absence of temperature. A temperature of 20 degrees Celsius is thus not twice as hot as a temperature of 10 degrees Celsius.
(iv) Ratio scale: The ratio scale possesses all the characteristics of the nominal, ordinal and interval scales. Data on a ratio scale can be classified and ranked, and also have equal intervals. In addition, a ratio scale has a true zero, meaning that zero indicates the absence of the quantity being measured. This genuine zero allows magnitudes to be compared as ratios. Length, time, mass, money and age are typical examples of ratio scales. For data analysis, a ratio scale may be used to measure sales, pricing, market share and client count.
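The four scales described above can be contrasted with a short Python sketch that shows which calculations are meaningful on each. All observations below are hypothetical and chosen only for illustration. The Kelvin conversion is included to show why a ratio of Celsius temperatures is not meaningful.

```python
from collections import Counter
from statistics import median

# Hypothetical observations, one small sample per scale.
fund_types = ["equity", "debt", "equity", "balanced"]  # nominal
pe_ranks = [1, 2, 3, 4, 5]                              # ordinal (rank by P/E)
temps_celsius = [10.0, 20.0]                            # interval
sales = [100.0, 200.0]                                  # ratio (true zero)

# Nominal: only counting the categories is meaningful.
category_counts = Counter(fund_types)

# Ordinal: order-based statistics such as the median rank are meaningful.
median_rank = median(pe_ranks)

# Interval: differences are meaningful, ratios are not.
temp_difference = temps_celsius[1] - temps_celsius[0]
naive_ratio = temps_celsius[1] / temps_celsius[0]
# The same two temperatures on the Kelvin scale, which has a true zero.
kelvin_ratio = (temps_celsius[1] + 273.15) / (temps_celsius[0] + 273.15)

# Ratio: a true zero makes "twice as much" a valid statement.
sales_ratio = sales[1] / sales[0]

print(category_counts, median_rank, temp_difference)
print(f"Celsius ratio: {naive_ratio:.3f}, Kelvin ratio: {kelvin_ratio:.3f}")
print(f"Sales ratio: {sales_ratio:.1f}")
```

The Celsius ratio comes out as 2, while the Kelvin ratio for the same two temperatures is only about 1.035, which is why "twice as hot" cannot be read off an interval scale. The sales ratio of 2, by contrast, is a valid statement because zero sales means no sales.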
In plain terms, digitization is the process of converting data and information from analogue to digital format. The data in its original form may be stored as an object, a document or an image. The objective of digitization is to create a digital surrogate of the data and information in the form of binary numbers that facilitates processing using computers. There are primarily two basic objectives of digitization. The first is to provide widespread access to data and information for a very large group of users simultaneously. Secondly, digitization helps in preserving data for a longer period. One of the largest digitization projects taken up in India is the ‘Unique Identification number’ (UID) or ‘Aadhar’.
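The idea of a digital surrogate "in the form of binary numbers" can be made concrete with a minimal sketch. It assumes a short piece of text as the record being digitized and UTF-8 as the encoding; both are illustrative choices, not something the syllabus prescribes.

```python
# Illustrative record content; any short text would do.
text = "HUL"

# Encode the text to bytes, then show each byte as eight binary digits.
encoded = text.encode("utf-8")
bits = " ".join(f"{byte:08b}" for byte in encoded)

# Decoding the binary digits recovers the original text.
decoded = bytes(int(chunk, 2) for chunk in bits.split()).decode("utf-8")

print(bits)     # 01001000 01010101 01001100
print(decoded)  # HUL
```

Because the binary form can be copied and decoded without loss, many users can access the same record simultaneously and it can be preserved for a long period.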
Digitization brings in some great advantages, which are mentioned below.
Why do we digitize?
There are many arguments that favour digitization of records. Some of them are mentioned below:
● Improves classification and indexing of documents, which helps in retrieval of the records
● Digitized records may be accessed by more than one person simultaneously.
● It becomes easier to reuse data that is difficult to reuse in its present format, e.g. very large maps, data recorded on microfilm etc.
● Helps in work processing
● Higher integration with business information systems
● Easier to keep back-up files and retrieval during any unexpected disaster
● Can be accessed from multiple locations through networked systems
● Increased scope for rise in organizational productivity
● Requires less physical storage space
How do we digitize?
Large institutions take up digitization projects with meticulous planning and execution. The entire process of digitization may be segregated into six phases:
1: Justification of the proposed digitization project
At the very start of the digitization project, the benefit accruing from the project needs to be identified. The cost of the project also needs to be computed, and the availability of resources assessed. Risk assessment is an important part of project assessment. Resources that face quick destruction may require early digitization. Most importantly, the expected value generation through digitization should be expressed in clear terms.
2: Assessment
In any institution, not all records are digitized. The data that requires digitization is decided on the basis of content and context. Some data may be digitized in a consolidated format, and some in a detailed format. The files, tables, documents, expected future use etc. are to be assessed and evaluated. The hardware and software requirements for digitization are also assessed at this stage, and the human resource requirement for executing the digitization project is planned. Risk assessment at this level, e.g. the possibility of natural disasters and/or cyber attacks, also needs to be completed.
3: Planning
Successful execution of a digitization project needs meticulous planning. Planning covers several stages, e.g. selection of the digitization approach, project documentation, resource management, technical specifications, and risk management.
The institution may decide to complete the digitization in-house or alternatively by an outsourced agency. It may also be done on-demand or in batches.
4: Digitization activities
Upon completion of the assessment and planning phases, the digitization activities start. The Wisconsin Historical Society developed a six-phase process, viz. planning, capture, primary quality control, editing, secondary quality control, and storage and management. The planning schedule is prepared at the first stage; calibration of hardware/software, scanning etc. are done next. A primary quality check is done on the output to check its reliability. Cropping, colour correction, assigning metadata etc. are done at the editing stage. A final quality check is done on randomly selected samples. Finally, after file validation, user copies are created and uploaded to dedicated storage space. The digitization process may be viewed in the figure below.
The complete digitization process. Source: Bandi, S., Angadi, M. and Shivarama, J. Best practices in digitization: Planning and workflow processes. In Proceedings of the Emerging Technologies and Future of Libraries: Issues and Challenges (Gulbarga University, Karnataka, India, 30-31 January), 2015
5: Processes in the care of records
Once the digitization of records is complete, a few additional requirements arise which are linked to the administration of records. Permission for access to data, intellectual control (over data), classification (if necessary), and upkeep and maintenance of data are a few additional requirements for data management.
6: Evaluation
Once the digitization project is updated and implemented, the final phase should be a systematic determination of the project’s merit, worth and significance using objective criteria. The primary purpose is to enable reflection and help identify changes that would improve future digitization processes.
The emergence of big data has changed the world of business like never before. The most important shift has happened in information generation and the decision-making process. There is a strong emergence of analytics that supports a more intensive data-centric and data-driven process of information generation and decision making. The data that surrounds the organization is being harnessed into information that informs and supports prudent decision making in a judicious and repeatable manner.
The pertinent question here is: what does an enterprise need to do to transform data into relevant information? As noted earlier, not all types of data lead to relevant information for decision making. The biannual KPMG global CFO report says that, for today’s finance function leaders, the “biggest challenges lie in creating the efficiencies needed to gather and process basic financial data and continue to deliver traditional finance outputs while at the same time redeploying their limited resources to enable higher-value business decision support activities.” For understanding the finance functions within an enterprise, we may refer to the following pyramid.
At the ‘basics’ or foundation of the pyramid, data generation may be automated by using ERP and other relevant software and hardware tools. The tools, techniques and processes that comprise the field of data & analytics (D&A) play a significant role in improving the quality of standard daily data and transaction processing. To turn data into user-friendly information, it should go through six core steps:
1. Collection of data: The collection of data may be done with standardized systems in place. Appropriate software and hardware may be used for this purpose. Appointment of trained staff also plays an important role in collecting accurate and relevant data.
2. Organising the data: The raw data needs to be organized in an appropriate manner to generate relevant information. The data may be grouped and arranged in a manner that creates useful information for the target user groups.
3. Data processing: At this step, data needs to be cleaned to remove unnecessary elements. If any data point is missing or not available, that also needs to be addressed. The presentation format for the data also needs to be decided.
4. Integration of data: Data integration is the process of combining data from various sources into a single, unified form. This step includes creation of data network sources, a master server, and users accessing the data from the master server. Data integration eventually enables the analytics tools to produce effective, actionable business intelligence.
5. Data reporting: The data reporting stage involves translating the data into a consumable format to make it accessible to users. For example, a business firm should be able to provide summarized financial information, e.g. revenue, net profit etc. The objective is that a user who wants to understand the financial position of the company should get relevant and accurate information.
6. Data utilization: At this final step, data is utilized to back corporate activities and enhance operational efficiency and productivity for the growth of the business. This makes corporate decision making truly ‘data driven’.
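The six steps above can be sketched as a small Python pipeline. Everything in it is an assumption made for illustration: the two branch data sets, the choice to fill a missing value with the mean of the known values, and the revenue target of 210 are hypothetical and not part of the syllabus.

```python
from statistics import mean

# Step 1 - Collection: hypothetical monthly revenue records from two branches.
branch_a = [{"month": "Apr", "revenue": 120.0}, {"month": "May", "revenue": None}]
branch_b = [{"month": "May", "revenue": 80.0}, {"month": "Apr", "revenue": 100.0}]

def organise(records):
    """Step 2 - Organising: index the records by month."""
    return {r["month"]: r["revenue"] for r in records}

def process(by_month):
    """Step 3 - Processing: fill missing values with the mean of known ones
    (mean imputation is just one possible choice)."""
    known = [v for v in by_month.values() if v is not None]
    fill = mean(known)
    return {m: (fill if v is None else v) for m, v in by_month.items()}

def integrate(*sources):
    """Step 4 - Integration: combine all sources into one total per month."""
    combined = {}
    for source in sources:
        for month, revenue in source.items():
            combined[month] = combined.get(month, 0.0) + revenue
    return combined

def report(combined):
    """Step 5 - Reporting: summarise the totals in a readable form."""
    return [f"{month}: {revenue:.1f}" for month, revenue in sorted(combined.items())]

def utilise(combined, target=210.0):
    """Step 6 - Utilization: list months that fall short of a hypothetical target."""
    return [month for month, revenue in sorted(combined.items()) if revenue < target]

combined = integrate(process(organise(branch_a)), process(organise(branch_b)))
print(report(combined))   # ['Apr: 220.0', 'May: 200.0']
print(utilise(combined))  # ['May']
```

Each function maps to one step, so a weak step (for example, a poor rule for missing values at step 3) can be replaced without touching the others.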
The quality information should lead to quality decisions. With the help of well curated and reported data, the decision makers should be able to add higher-value business insights leading to better strategic decision making.
In a sense, a judicious use of data analytics is essential for implementation of ‘lean finance’, which implies optimized finance processes with reduced cost and increased speed, flexibility and quality. By transforming the information into a process for quality decision making, the firm should achieve the following abilities:
(i) Understand a wide range of structured and unstructured data logically, and apply that information to corporate planning, budgeting, forecasting and decision support
(ii) Predict outcomes more effectively compared to conventional forecasting techniques based on historical financial reports
(iii) Spot emerging opportunities and capability gaps in real time
(iv) Make strategies for responding to uncertain events like market volatility and ‘black swan’ events through simulation
(v) Diagnose, filter and extract value from financial and operational information for making better business decisions
(vi) Recognize viable advantages to serve customers in a better manner
(vii) Identify possible frauds on the basis of data analytics
(viii) Build impressive and useful dashboards to measure and demonstrate success, leading to effective strategies
The aim of a data-driven business organization is to develop a business intelligence (BI) system that is not only focused on efficient delivery of information but also provides accurate strategic insight into the operational and financial system. This impacts the organizational capabilities in a positive manner. It makes the organization resilient to market pressures and creates competitive advantages by serving customers in a better way using data and predictive analytics.
While data analytics is an important tool for decision making, managers should never take an important analysis at face value. The hidden insights that lie beneath the surface of the data set need to be explored, and what appears on the surface should be looked at with some scepticism.
The emergence of new data analytics tools and techniques in the financial environment allows accounting and finance professionals to gain unique insights into the data, but at the same time creates unique challenges in exercising scepticism. As more data is available now, analysts and auditors are not only getting more information, but are also facing challenges in managing and investigating red flags.
One major concern about the use of data analytics is the likelihood of false positives, i.e. the analysis may flag potential anomalies that are later found to be reasonable and explainable variations in the data. Studies show that the frequency of false positives increases proportionately with the size and complexity of data. A few studies also show that analysts face problems while determining outliers using data analytics tools.
Professional scepticism is an important focus area for practitioners, researchers, regulators and standard setters. At the same time, professional scepticism may result in additional costs, e.g. strained client relationships and budget overages. Under such circumstances, it is important to identify and understand the conditions in which finance and audit professionals should apply professional scepticism. A fine balance needs to be kept between costly scepticism and underutilizing data analytics in order to keep the cost under control.
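The arithmetic behind the false-positive concern can be shown with a short worked sketch. All rates below are hypothetical assumptions made for illustration: 1% of records are truly anomalous, the tool catches 90% of them, and it wrongly flags 5% of normal records.

```python
def flag_summary(n_records, anomaly_rate, sensitivity, false_positive_rate):
    """Return (true flags, false flags, share of flags that are genuine)."""
    true_anomalies = n_records * anomaly_rate
    normal_records = n_records - true_anomalies
    true_flags = true_anomalies * sensitivity
    false_flags = normal_records * false_positive_rate
    precision = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, precision

# Same hypothetical rates applied to a small and a ten times larger data set.
for n in (10_000, 100_000):
    true_flags, false_flags, precision = flag_summary(n, 0.01, 0.90, 0.05)
    print(f"{n} records: {true_flags:.0f} genuine flags, "
          f"{false_flags:.0f} false positives, precision {precision:.1%}")
```

Under these assumptions only about 15% of the red flags are genuine, and the number of false positives to investigate grows tenfold when the data set grows tenfold. This is the workload that has to be weighed against the cost of scepticism.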
Data analytics can help in the decision-making process and make an impact. However, this empowerment of business also comes with challenges. The question is: how can business organizations ethically collect, store and use data, and what rights need to be upheld? Data ethics addresses the moral obligations of gathering, protecting and using personally identifiable information. At present, it is a major concern for analysts, managers and data professionals. The five basic principles of data ethics that a business organization should follow are:
(i) Regarding ownership: The first principle is that ownership of any personal information belongs to the person. It is unlawful and unethical to collect someone’s personal data without their consent. Consent may be obtained through digital privacy policies, signed agreements, or by asking users to agree to terms and conditions. It is always advisable to ask for permission beforehand to avoid future legal and ethical complications. Some financial data may be sensitive in nature, so prior permission must be obtained before using it for further analysis.
(ii) Regarding transparency: Maintaining transparency is important while gathering data. The objective with which the company is collecting a user’s data should be known to the user. For example, if the company is using cookies to track the online behaviour of the user, a written policy should tell the user that cookies will be used for tracking online behaviour and that the collected data will be stored in a secure database to train an algorithm that enhances user experience. After reading the policy, the user may decide whether or not to accept it. Similarly, while collecting financial data from clients, the purpose for which the data will be used should be clearly stated.
(iii) Regarding privacy: Even if the user allows the company to collect, store and analyze personally identifiable information (PII), that does not imply it should be made publicly available. For companies, it is mandatory to publish some financial information, e.g. through annual reports. However, there may be much confidential information which, if it falls into the wrong hands, may create problems and financial loss. To protect the privacy of data, a data security process should be in place. This may include file encryption, dual-authentication passwords etc. The possibility of a breach of data privacy may also be reduced by de-identifying a dataset.
(iv) Regarding intention: The intention of data analysis should never be to profit from others’ weaknesses or to hurt others. Collecting data that is unnecessary for the analysis is unethical and should be avoided.
(v) Regarding outcomes: In some cases, even if the intentions are good, the result of data analysis may inadvertently hurt the clients and data providers. This is called disparate impact, which is unethical.
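The de-identification mentioned under the privacy principle can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the record layout, the field names and the salt value are hypothetical. Replacing an identifier with a salted hash is pseudonymisation, which reduces but does not eliminate the risk of re-identification.

```python
import hashlib

# Hypothetical salt; a real one should be secret and kept out of source code.
SALT = b"replace-with-a-secret-salt"

def pseudonymise(value: str) -> str:
    """Replace an identifier with a shortened salted SHA-256 digest."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:12]

def deidentify(record: dict) -> dict:
    """Drop direct identifiers and pseudonymise the customer id."""
    cleaned = {k: v for k, v in record.items() if k not in {"name", "phone"}}
    cleaned["customer_id"] = pseudonymise(record["customer_id"])
    return cleaned

# Illustrative record with made-up values.
record = {"customer_id": "C001", "name": "A. Kumar",
          "phone": "0000000000", "balance": 2500.0}
print(deidentify(record))
```

The analyst can still group transactions by the pseudonymous id and analyse balances, but the name and phone number never reach the analysis data set.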
Solved Case 1
Mr. Arjun is working as a data analyst with Manoj Enterprises Limited. He was invited by an educational institute to deliver a lecture on data analysis. He was told that the participants would be fresh graduates who would like to get a glimpse of the emerging field of ‘data analysis’. He is planning the lecture and thinking about the concepts to be covered. In your opinion, which fundamental concepts should Arjun cover in his lecture?
A. Theoretical Questions:
Multiple Choice Questions:
1. Numerical data may be expressed as
(a) In the form of text
(b) In the form of numbers
(c) In the form of images
(d) All of the above
answer: b
Choice "B" is correct, as numerical data is expressed in the form of numbers.
Numerical data is typically expressed in the form of numbers. This can include data such as counts, measurements, percentages, ratings, and so on.
The use of numerical data can be particularly useful for analysis, as it can be easily quantified and manipulated to identify patterns, trends, and relationships. In addition, numerical data can be used to make predictions or forecasts based on historical trends and patterns.
Overall, numerical data is an important type of data that is commonly used in a wide range of fields, including science, business, engineering, and finance, among others.
2. The descriptive data may be deciphered as
(a) May be deciphered in the form of qualitative information
(b) May be deciphered in the form of quantitative information
(c) May be deciphered in the form of information from informal sources
(d) All of the above
answer: a
3. Data represented in the form of picture is termed as
(a) Graphic data
(b) Qualitative data
(c) Quantitative data
(d) All of the above
answer: a
Choice "A" is correct, as data represented in the form of a picture is termed graphic data.
The term "graphic data" can have different meanings in different contexts, but generally speaking, it refers to data that is represented visually through the use of graphics, charts, or other types of images.
Graphical data can be useful for displaying complex information in a clear and concise manner, making it easier for viewers to understand and interpret. For example, in a business setting, graphical data might be used to display sales figures over time, customer demographics, or other key metrics. In a scientific setting, graphical data might be used to display experimental results or other research findings.
Overall, the use of graphical data can be a powerful tool for presenting and analyzing data in a clear and effective way.
4. Which of the following is/are the reason for digitization
(a) Helps in work processing
(b) Requires less physical storage space
(c) Digitized records may be accessed by more than one person simultaneously
(d) All of the above
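answer: d
Choice "D" is correct, as all three options are among the arguments that favour digitization of records: it helps in work processing, requires less physical storage space, and allows digitized records to be accessed by more than one person simultaneously.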
5. To turn data into user-friendly information, it should go through one or more of the following core steps
(a) Collection of data
(b) Organising the data
(c) Data processing
(d) All of the above
answer: d
Choice "D" is correct, as collection of data, organising the data and data processing are all among the six core steps that turn data into user-friendly information.
State True or False
1. Digitization improves classification and indexing of documents, which helps in retrieval of the records. answer: T
2. Data is not a source of information. answer: F
3. One of the largest digitization projects taken up in India is the ‘Unique Identification number’ (UID) or ‘Aadhar’. answer: T
4. When ‘information’ is used for solving a problem, we say it is the use of knowledge. answer: T
5. Any data expressed as a number is numerical data. answer: T
Fill in the blanks
1. There are primarily two basic objectives of digitization.
2. By the term quantitative data, we mean data expressed in numbers.
3. The daily stock price of Tata Steel Ltd is an example of numerical data.
4. Data is a source of information.
5. When ‘information’ is used for solving a problem, we say it is the use of knowledge.
Short essay type questions
1. Define the term ‘descriptive data’ with examples.
Answer:
2. Discuss the difference between ordinal scale and ratio scale.
Answer:
3. Discuss the relationship between data, information and knowledge.
Answer:
4. ‘One major concern about the use of data analytics is the likelihood of false positives’ – briefly discuss.
Answer:
Essay type questions
1. Discuss the five basic principles of data ethics that a business organization should follow.
Answer:
2. ‘The quality information should lead to quality decisions’ – Discuss.
Answer:
3. Discuss the six core steps that may turn the data into user-friendly information.
Answer:
4. Discuss the six phases that comprise the entire process of digitization.
Answer:
5. Why do we digitize data?
Answer:
Unsolved Case
1. Ram Kumar is the head data scientist of Anjana Ltd. For the last few weeks, he has been working along with his team on extracting information from a huge pile of data collected over time. His team members are working day and night collecting and cleaning the data. He has to make a presentation before the senior management of the company to explain the findings. Discuss the important steps he needs to take care of to transform raw data into useful knowledge.
References:
● Data-driven business transformation. KPMG International
● Davy Cielen, Arno D B Meysman, and Mohamed Ali. Introducing Data Science. Manning Publications Co USA
● www.finance.yahoo.com
● www.google.com
● Data, Information and Knowledge. Cambridge International
● Dereck Barr-Pulliam, Joseph Brazel, Jennifer McCallen and Kimberly Walker. Data Analytics and Skeptical Actions: The Countervailing Effects of False Positives and Consistent Rewards for Skepticism
● Annual Report of Hindustan Unilever Limited. (2021-22)
● www.uidai.gov.in
● Bandi, S., Angadi, M. and Shivarama, J. Best practices in digitization: Planning and workflow processes. In Proceedings of the Emerging Technologies and Future of Libraries: Issues and Challenges
● Finance’s Key Role in Building the Data-Driven Enterprise. Harvard Business Review Analytic Services
● How to Embrace Data Analytics to Be Successful. Institute of Management Accountants. USA
● The Data Analytics Implementation Journey in Business and Finance. Institute of Management Accountants. USA
● Principles of data ethics in business. Business Insights. Harvard Business School.