Universitat Internacional de Catalunya

MÓDULO 3.2: Métodos Estadísticos y Data Mining

5
13946
1
First semester
OB
Main language of instruction: Spanish

Other languages of instruction: Catalan, English

Teaching staff


Faculty:

Imanol Morata (imorata@uic.es)

José Antonio Esteban (jaesteban@uic.es)

David Roche (droche@uic.es)

Introduction

This subject combines a review of basic statistics with an introduction to some of the classic techniques for extracting knowledge from data and making decisions. Statistics is an essential foundation for understanding the techniques more typical of data science, such as the extraction of patterns, rules and knowledge. In short, the subject revisits basic statistical concepts alongside some more specific data science techniques.

Pre-course requirements

This course assumes some knowledge of basic statistics and of programming in R and Python.

Objectives

The objective of this course is to use statistics and certain algorithmic techniques to complete the basic knowledge needed for training in data science. In particular, it covers hypothesis testing, linear regression, principal component analysis and evolutionary computation. There will also be a session on the life cycle of data science projects and on machine learning techniques.

 

Competences/Learning outcomes of the degree programme

  • 31 - To develop the ability to identify and interpret numerical data.
  • 36 - To interpret quantitative and qualitative data and apply mathematical and statistical tools to business processes.
  • 64 - To be able to plan and organise one's work.
  • 32 - To acquire problem solving skills based on quantitative and qualitative information.
  • 40 - To be able to choose statistical methods appropriate to the object of analysis.
  • 43 - To acquire skills for using statistical software.
  • 50 - To acquire the ability to relate concepts, analyse and synthesise.
  • 51 - To develop decision making skills.
  • 53 - To acquire the skills necessary to learn autonomously.
  • 54 - To be able to express one’s ideas and formulate arguments in a logical and coherent way, both verbally and in writing.
  • 56 - To be able to create arguments which are conducive to critical and self-critical thinking.
  • 65 - To acquire the ability to put knowledge into practice.
  • 66 - To be able to retrieve and manage information.

Learning outcomes of the subject

At the end of this subject, students will be able to tackle quantitative and qualitative problems, applying classical statistical techniques and some algorithms typical of data science to extract information from data in support of decision-making.

Syllabus

Contents of the subject:

  1. Review of programming elements
  2. Hypothesis tests for the difference of means
     a. Parametric: two means and more than two means
     b. Nonparametric: two means and more than two means
  3. Linear regression
     a. Simple model
     b. Multiple model
  4. Principal component analysis
  5. Evolutionary computing
  6. Business sessions on the use of machine learning techniques

Teaching and learning activities

In person



This subject follows a “learning by doing” approach: each theoretical concept developed in the sessions is applied to practical cases in each of the languages used. The aim is always to bring students closer to the reality of their profession, where they will have to apply the theoretical and practical knowledge acquired throughout the course. Most sessions are structured as follows:

  1. Presentation of a theoretical summary by the teaching staff
  2. Example application by the teaching staff
  3. Problems posed and solved by the students
  4. Joint problem solving
  5. Case study with simulated or real data
  6. Practical work to do at home, intended to consolidate the concepts learned in the session


Evaluation systems and criteria

In person



The mark for this subject is the equally weighted average of all the assignments submitted throughout the course. The final mark is the continuous assessment mark.

Bibliography and resources

- Statistics for Business and Economics, Global Edition. Paul Newbold, William Carlson, Betty Thorne. 2019.

- An Introduction to Statistical Learning: with Applications in R. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2014. Springer.

- R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics. J D Long and Paul Teetor. 2019.