Data Management and Business Intelligence
The database industry has developed over the last twenty years from a niche market into the main IT constituent of most organizations, small or big. The explosion in the volume of data collected in today’s applications dictates novel architectures, query paradigms and storage models. The first part of the course will cover traditional topics, such as: data modeling, design and architecture, relational systems, query languages, SQL, performance issues, query processing, transactions, web, distributed databases. It will also introduce the challenges modern data management systems face, such as stream data and cloud computing. Data warehousing, decision support, OLAP and data mining, which many people collectively call Business Intelligence (BI), have reached maturity, with an abundance of systems, platforms and methods. BI has evolved from a niche area for large and highly sophisticated corporations into an essential component of any modern business entity or institution. The second part of the course will review basic BI concepts, such as: design, implementation and modeling of data warehouses, star schemas, data cubes, OLAP, tools & systems and design methodologies. It will also cover new trends in BI such as main-memory BI and column-oriented systems.
Information Systems & Business Process Management
This course introduces the notion of information systems (I.S.) used in enterprises, links them with business analytics (B.A.) and analyses business processes (B.P.) as the fundamental element of modern enterprises and the management of their performance. It consists of four parts:
- Information Systems for Enterprises: Basic principles and functions of I.S. Presentation of I.S. categories and applications in enterprises. Strategic advantage and I.S. planning. Managing I.S. and I.S. resources in organizations.
- Business Analytics in Enterprise I.S.: Developing new insights and understanding of business performance through I.S. Achieving the basic types of analytics through I.S.: decisive, descriptive, predictive, prescriptive. Examples of B.A. systems for marketing, retail sales, supply-chain, financial services, telecommunications, e-commerce etc.
- Business Process Management: Types of B.P. and their function in the enterprise. B.P. modelling techniques. Application of IT tools for B.P. modelling and management. Comprehension of B.P. architecture. Specification of requirements for new I.S. and infrastructure.
- Principles of BP Performance Management: Process performance metrics and practical case examples of enterprise and inter-organizational systems: ERP, CRM, MIS, e-commerce and e-government. Process management frameworks and the balanced scorecard approach.
Large Scale Optimization
This course introduces advanced optimization tools and techniques, with the main emphasis on the application of computational intelligence algorithms to large scale optimization problems and cases that arise in business and industry, such as transportation, logistics, production and services. On completion of this course, students should be able to: broaden their exposure to computational methodologies; analyze and design effective computational intelligence algorithms for complex business problems; and provide examples and cases of how computational intelligence algorithms can be used to solve real-life problems. The course material includes the following thematic areas: construction and local search algorithms; simulated annealing algorithms; tabu search algorithms; ant colony optimization; evolutionary algorithms.
Mining Big Datasets
Understanding big data can help improve decision making in big enterprises. Existing techniques are dwarfed by the complexity, variety, scale and dynamics of big data. In this course we will first identify the major challenges in mining big datasets in modern applications of interest. We will then survey emerging computational platforms in the area of large-scale distributed processing and discuss recent algorithmic results that can help attack big data mining problems.
Statistics for Business Analytics I
This course aims to introduce basic concepts of probability and statistics that are useful to a great extent in several other courses. The course assumes that everybody has some basic idea about statistics, so the focus will be on clarifying the usefulness and importance of the approaches and how Statistics can considerably help the decision making process. To this end, a brief introduction to the basic principles of probability theory and their connection to problems in Statistics will be given. Basic statistical ideas for descriptive statistics and data visualization will be discussed, together with problems of statistical inference such as estimation and hypothesis testing. Regression type models will be discussed, including simple, multiple and logistic regression, with a brief introduction to generalized linear models. Issues of statistical process control will be covered. The Bayesian approach to statistical modeling, which offers certain possibilities with huge datasets, will be introduced and worked through. All material will focus mainly on applications, but the basic statistical insight will be discussed in depth. Problems that arise with big data, and their modern solutions, will also be discussed.
Innovation and Entrepreneurship (short course)
The growth of electronic channels over the last decade, paired with developments in social media, Web 2.0 and crowdsourcing, sensor networks and ubiquitous computing, has led to an explosion of data. Due to the speed of developments, most of these data remain unexploited, and the need to derive meaningful information and knowledge from them has increased to an unprecedented degree. This has created a new landscape for innovation and entrepreneurship, opening up opportunities for the development of new tools, services and offerings that respond to this need. The objective of this course is to provide the theoretical and practical basis that will allow students to identify business opportunities and innovation areas associated with the exploitation of big data and to design innovative services in response to the identified business needs. Moreover, the course will provide guidelines in the area of business planning to support an entrepreneurial mindset. A series of case studies will be discussed from this perspective, while students will have the opportunity to propose their own ideas exploiting big data analytics, evaluate alternative business models and develop the respective business plans in practice.
Analytics Practicum I
SAS Tools (12 hours)
Special topics such as visual analytics, crowdsourcing, location-based services, other systems and tools (18 hours)
Big Data Systems
The enormous size of today’s data sets and the specific requirements of modern applications have necessitated the growth of a new generation of data management systems, where the emphasis is put on distributed and fault-tolerant processing. New programming paradigms have evolved, novel systems and tools have been developed and an abundance of startups offering data management and analysis solutions have appeared. Part of this course will cover MapReduce and NoSQL systems. Topics include: MapReduce programming, Hadoop, Pig and Hive, developing applications in Amazon’s EC2 environment, key-value stores such as Redis, document stores such as MongoDB and graph databases such as Neo4j. In addition, engineering software that can efficiently handle large data sets requires specialized skills and familiarity with sophisticated tools. Part of the course will give an overview of general purpose tools and describe how cloud infrastructures can be configured and used for large data processing. Then a systematic method for locating and addressing performance issues will be presented. For the cases where specialized processing is required, we will examine low-level techniques, like memory mapping and copy-on-write. Finally, we will see how visualization of big data can be performed and automated.
Statistics for Business Analytics II
This course aims to present state-of-the-art methods used with real data for eliciting important information for decision making. The course will start with the basic principles of sampling methodologies and their importance. Then dimension reduction methods like Principal Components Analysis and Factor Analysis and their variants will be discussed. Supervised and unsupervised statistical learning methods will follow. For unsupervised methods, different types of clustering will be discussed, like partition methods, hierarchical methods and model based methods, while problems with large data sets will be illustrated. Supervised learning methods like discriminant analysis, decision trees, kernel based approaches, nearest neighbors and other classification methods will also be presented. Problems of variable and model selection will be discussed. Finally, a brief introduction to Predictive Analytics will be given to elaborate on the difference and the importance of predictive approaches in Business Analytics. For all topics, several examples will be presented using R and its libraries.
Social Network Analysis (short course)
This course will introduce students to social network analytics (SNA) and its instrumental value for businesses and society. SNA encompasses techniques and methods for analyzing the constant flow of information over online social networks (e.g. Facebook posts, Twitter feeds, Foursquare check-ins), aiming to identify, sometimes even in real time, patterns of information propagation that are of interest to the analyst. The course will provide students with an in-depth understanding of the opportunities, challenges and threats arising from online social media as far as businesses and society at large are concerned. It will use case-based teaching and discussions to introduce students to the social and ethical issues that often arise from mining the publicly available information across online social networks for business purposes and/or other types of analyses. Finally, students will be introduced to the concepts of the wisdom of the crowds and social learning, investigating the conditions under which opinion convergence (asymptotic learning) or herding may occur in online social networks.
Machine Learning and Content Analytics (short course)
This course is concerned with extracting useful information from unstructured big data, mostly data in the form of text and speech. The course will introduce core concepts, models, and algorithms from machine learning, natural language processing, and speech processing that can be used to recognize speech, and to normalize, classify, cluster, tag, parse, disambiguate, and extract information from texts and spoken utterances. Several application areas will be considered, including filtering e-mails and social media messages, summarizing opinions and performing sentiment analysis (e.g., for particular products) in social media or discussion fora, monitoring spoken dialogues in call centers (e.g., to check for compliance with protocols), populating databases with information extracted from news feeds (e.g., company mergers and acquisitions), and finding answers to scientific questions in the research literature. The students will have the opportunity to learn how to use existing tools (e.g., machine learning, speech recognition, and natural language processing toolkits) by applying them to realistic datasets. Key concepts and applications of multimodal content analytics will also be covered if time permits.
Business Analytics Use Cases
This course is offered in collaboration with the Master’s program’s industrial partners and will cover five or six real analytics implementations across a wide range of vertical sectors, such as finance, marketing, health, energy, public sector, supply-chain, transportation, etc. The goal is to present analytics case studies, covering the stages of the analysis pipeline, such as requirements specification and problem definition, data collection, transformation and integration, model building and visualization.
Business and Privacy Issues in Data Analysis
Introduction to basic terms/notions: privacy, data protection, confidentiality, security. Information: regulation and governance. Theoretical and regulatory approaches in Greece, the EU and abroad. The notion of personal data. Regulation of the use of personal data in the EU/Greece. Analysis of the main concepts, approaches and requirements of the General Data Protection Regulation (legal grounds, principles, rights of data subjects). Data Protection by Design and Data Protection Impact Assessment and BA. Big Data Analytics: characteristics of big data analytics, techno-economical context and impact on personal data governance. Big Data Analytics and data protection principles (purpose limitation, data minimization). Profiling and decision making. Artificial Intelligence/Machine Learning and processing of personal data. Accountability, transparency and explainability of AI (applications). Issues concerning discrimination and the impact of prediction/decision making. Ethics and Business/Data Analytics.
Data Governance, Information Life Cycle Management, policies, metadata management.
The concepts of pseudonymization, anonymization and security of information systems, and how each concept contributes to the protection of personal data. Anonymization techniques and techniques for achieving k-anonymity, l-diversity, differential privacy, etc. Use of the Amnesia tool to anonymize data.
From creating an evidence-based health care system to building sustainable cities, open data is increasingly a critical infrastructure in many initiatives to make the world a better place. In the coming years, the collection, analysis, and use of massive amounts of data will have the potential to generate enormous social and economic benefits, but successfully capitalizing on these opportunities will require public policies designed to allow data-driven innovation in science and business to flourish.
Analytics Practicum II
Cloud Analytics (12 hours)
Special topics such as visual analytics, crowdsourcing, location-based services, other tools and systems (18 hours)
Advanced Topics in Statistics
Time series models, basic principles, indices and problems. Basic models, autoregressive models, moving average models, ARIMA models. Estimation, prediction and forecasting. Models for financial data (ARCH, GARCH, etc.). Sampling methods and their use in practical problems. Simulation and statistical inference using computers. Statistical models for unstructured data, recent methods for statistics with big data. Dimension reduction techniques.
Advanced Topics in Data Engineering
Statistical analysis and machine learning require clean and transformed data, structured into a specific format. However, data exists in various systems, models and formats. Systems vary from state-of-the-art (e.g. Hadoop) to legacy (e.g. IBM mainframes and COBOL). Data can be stored as relations, but also as JSON/XML documents, graphs, etc. Finally, data can be structured or unstructured, such as text and images. Topics in this course will include data extraction, data transformation, data integration, data virtualization, entity matching, and others. Students will have to use systems and tools on these topics (e.g. SSIS, Pentaho, Denodo, etc.).