Project List

The research project will be completed in groups of three students, and each group must submit a written proposal by Thursday of the second week. The objective is to design and implement a system that meets a given set of requirements. Some possible project ideas are described below. Original ideas are welcome, but should be approved by the instructor before the proposal deadline.

  1. Cloud Data Management

    -Database as a Service, Multi-tenancy
    -Elasticity and Scalability for Cloud Data Management Systems
    -New Protocols, Service Interfaces and Data Models for Cloud Databases
    -Polyglot Persistence, NoSQL, Schemaless Data Modeling, Integration
    -Data-Centric Web-Services, RESTful Data Services
    -Database Architectures for Mobile and Web Clients
    -Content Delivery Networks, Caching, Load-Balancing, Web-scale workloads
    -Virtualization for Cloud databases, Storage Structures and Indexing
    -Frameworks and Systems for Parallel and Distributed Computing
    -Scalable Machine Learning, Analytics and Data Science
    -Resource and Workload Management in Cloud Databases
    -Tunable and Eventual Consistency, Latency
    -High Availability, Reliability, Failover
    -Transactional Models for Cloud Databases
    -Query Languages and Processing, Programming Models
    -Consistency, Replication and Partitioning
    -CAP, Data Structures and Algorithms for Eventually Consistent Stores

  2. Cloud Security and Forensics

    Cloud computing offers utility-oriented Information and Communications Technology (ICT) services to users all over the world. The evolution of cloud computing is driving the design of datacentres by architecting them as networks of virtual services; this enables users to access and run applications from anywhere in the world. As the prevalence and usage of networked cloud computer systems increase, so do the security concerns these systems pose and the likelihood of their being used for criminal behaviour. Thus, this new computing evolution has a direct effect on, and creates challenges for, cybersecurity and digital forensic practitioners.
    The field of digital forensics has grown rapidly over the last decade due to the rise of the Internet and associated crimes; however, while the theory is well established, the practical application of the discipline is still new and developing. Law enforcement agencies can no longer rely on traditional digital forensic methods of data acquisition through device seizure to gather relevant evidence pertaining to an investigation from cloud sources. Using traditional digital forensic methods will lead to the loss or overlooking of valuable evidential material hosted on cloud-based infrastructures. Cloud computing and its impact on digital forensics will continue to grow, and traditional digital forensics methods are inadequate for cloud forensic investigations. Topics include:
    Cloud Forensics and e-Discovery
    Intrusion and Attack Readiness
    Cloud Cryptography
    Datacentre and Data Security
    Big Data Forensics
    Security and Privacy for Cloud Computing
    Trust and Policy Management
    Case Studies of Cloud-based Attacks
    Cloud Data Recovery
    Private Cloud Security
    Network Security and Forensics
    Mobile Cloud Security and Forensics
    Forensics GIS
  3. Elastic Data Management in Cloud Systems

    The adoption of cloud computing is growing, since today's applications continuously produce huge volumes of heterogeneous data that are accessed and shared in large-scale environments. The elastic management of such data is one of the most important research areas in cloud computing. It raises new problems regarding modeling, storage, processing, optimization, cost models, replication, data privacy and security, and monitoring services. Topics include:
    - Cloud infrastructures for big data management
    - Support for scalable and elastic services
    - Virtualization for cloud databases
    - Service level agreement management for cloud systems
    - Business models and economic pricing policies
    - Cloud data management for mobile applications
    - Data privacy and security
    - Transactional model for cloud databases
    - High availability, reliability and fault tolerance
    - Dynamic data placement issues
    - Service discovery
    - Query processing, indexing and optimization in cloud computing systems
    - Cost models
    - Replication and caching
    - Data storage
    - Partitioning and load balancing
    - Database as a service
    - Advances in NoSQL databases
    - Multi-tenancy and workload isolation
    - Languages for massively parallel query execution
    - Performance evaluation and benchmarking
    - Cloud applications and experiences
  4. Crowd Sourcing

    Recently, the potential of crowdsourcing has been recognized in business and social contexts, enterprise resource planning software, and many other fields, as a way to systematically design and implement Information Technology-based ideas and to deploy them in newer intelligent applications. Crowdsourcing deployment not only addresses access to systems and connections, but also prompts discussion and research on problem-specific skills and technologies. Crowdsourcing, as an online distributed problem-solving and production model, has the potential to attract creators and innovators to build a new generation of collective intelligence. Some themes in crowdsourcing are:
    Open innovations and open source
    Software models
    Theory development and designs
    Data Mining
    Enterprise resource planning
    Business to Business
    Web technologies, etc.
  5. Internet and Social media for Environmental Monitoring

    Advances in digital technologies and the high penetration of the Internet have facilitated the sharing of environmental information, such as meteorological measurements and observations of the natural surroundings. Since the analysis of environmental information is critical both for human activities (e.g., agriculture, deforestation) and for the sustainability of the planet (e.g., nature conservation, green living, eco-driving), it is of great importance to develop techniques for the retrieval and aggregation of environmental information that is available over the Internet. The exploitation of user-generated content is particularly valuable because, despite being of low quality in many cases, it could provide important information about areas that are not monitored by existing stations.
    Research topics of interest:
    - Indexing and retrieval of environmental data from the Web
    - Analysis of multimedia environmental user-generated content
    - Computer vision for environmental video and image processing
    - Multimedia analysis for weather phenomena and natural disasters understanding
    - Discovery of environmental multimedia information on the Web
    - Content extraction from environmental data on the Web
    - Analysis of public interest in environmental information via social media
    - Aggregation of environmental data from multiple sources
    - Data mining of Web data for environmental applications
    - Social network analysis of environmental communities and discussions on social media
    - Personalised services based on environmental information
    - Fusion of multimedia environmental information
    - User Interfaces, presentation and visualization methods for environmental data
    - Semantic Web approaches for environmental data
    - Decision support services and reasoning of multimedia environmental data
    - Crowdsourcing techniques and citizen science for environmental data mining
    - Sensor-based environmental monitoring
  6. Data Privacy Management

    Organizations are increasingly concerned about the privacy of the information that they manage (as witnessed, for example, by lawsuits filed against organizations for violating the privacy of customers' data). Thus, the management of privacy-sensitive information is critical for every organization. This poses several challenging problems, such as how to translate high-level business goals into system-level privacy policies, administration of privacy-sensitive data, privacy-preserving data integration and engineering, privacy-preserving access control mechanisms, information-oriented security, and query execution on privacy-sensitive data for partial answers. Topics of interest include:
    * Privacy Information Management
    * Privacy Policy-Based Infrastructures and Architectures
    * Privacy-Oriented Access Control Languages and Models
    * Privacy in Trust Management
    * Privacy in Digital Currencies
    * Privacy Risk Assessment and Assurance
    * Privacy Services
    * Cryptography and Cryptanalysis
    * Privacy Policy Analysis
    * Query Execution over Privacy-Sensitive Data
    * Privacy Preserving Data Mining
    * Privacy for Integrity-Based Computing
    * Privacy Monitoring and Auditing
    * Privacy in Social Networks
    * Privacy in Ambient Intelligence (AmI) Applications
    * Individual Privacy vs. Corporate/National Security
    * Privacy in Computer Networks
    * Privacy and RFIDs
    * Privacy and Big Data
    * Privacy in Sensor Networks
    * Privacy and Security Management in Cloud Computing
    * Privacy and Security Management in the IoT
    * Privacy and Security Management in Pervasive Computing
  7. Risks and Security of Internet and Systems

    The Internet has become essential for the exchange of information between user groups and organizations from different backgrounds and with different needs and objectives. These users are exposed to increasing risks regarding security and privacy, due to the development of more and more sophisticated online attacks, the growth of cybercrime, etc. Attackers nowadays do not lack motivation, and they are more and more experienced. To make matters worse, tools for performing attacks have become easily accessible. Moreover, the increasing complexity and the immaturity of new technologies, such as pervasive, mobile and wireless devices and networks, raise new security challenges. In this context, new security mechanisms and techniques should be deployed to achieve an assurance level acceptable for critical domains such as transportation, health, defence, banking, critical infrastructures, embedded systems and networks, avionics systems, etc. Topics include:
    • Analysis and management of risk
    • Attacks and defenses
    • Attack data acquisition and network monitoring
    • Cryptography, Biometrics, Watermarking
    • Dependability and fault tolerance of Internet applications
    • Distributed systems security
    • Embedded system security
    • Empirical methods for security and risk evaluation
    • Hardware-based security and Physical security
    • Intrusion detection and Prevention systems
    • Organizational, ethical and legal issues
    • Privacy protection and anonymization
    • Risk-aware access and usage control
    • Security and risk assessment
    • Security and risks metrics
    • Security and dependability of operating systems
    • Security and safety of critical infrastructures
    • Security and privacy of peer-to-peer systems
    • Security and privacy of wireless networks
    • Security models and security policies
    • Security of new generation networks, security of VoIP and multimedia
    • Security of e-commerce, electronic voting and database systems
    • Security of social networks
    • Security of industrial systems
    • Smartphone security and privacy
    • Traceability, metrology and forensics
    • Trust management
    • Use of smart cards and personal devices for Internet applications
    • Web and cloud security
  8. Security and Trust Management

    Topics include:
    * Access control
    * Mobile security
    * Security and trust in the Internet of Things
    * Anonymity
    * Networked systems security
    * Security and trust in pervasive computing
    * Applied cryptography
    * Operating systems security
    * Security and trust in services
    * Authentication
    * Privacy
    * Security and trust in social networks
    * Complex systems security
    * Security and trust metrics
    * Social implications of security and trust
    * Data and application security
    * Security and trust policies
    * Trust assessment and negotiation
    * Data protection
    * Security and trust management architectures
    * Trust in mobile code
    * Data/system integrity
    * Security and trust for big data
    * Trust models
    * Digital rights management
    * Security and trust in cloud environments
    * Trust management policies
    * Economics of security and privacy
    * Security and trust in content delivery networks
    * Trust and reputation systems
    * Formal methods for security and trust
    * Security and trust in crowdsourcing
    * Trusted platforms
    * Identity management
    * Security and trust in grid computing
    * Trustworthy systems and user devices
    * Legal and ethical issues
  9. Trust, Security and Privacy for Big Data

    The proliferation of new technologies such as Internet of Things and cloud computing calls for innovative ideas to retrieve, filter, and integrate data from a large number of diverse data sources. Big Data is an emerging paradigm applied to datasets whose volume/velocity/variability is beyond the ability of commonly used software tools to manage and process the data within a tolerable period of time. More importantly, Big Data has to be of high value, and should be protected in an efficient way. Since Big Data involves a huge amount of data that is of high-dimensionality and inter-linkage, existing trust, security, and privacy measures for traditional databases and infrastructures cannot satisfy its requirements. Topics include:
    Trust
    (1) Trust semantics, metrics, and models for Big Data
    (2) Trust management and evaluation for Big Data
    (3) Trusted systems, software, and applications for Big Data
    (4) Trusted platform implementation technologies for Big Data
    (5) Information quality/trustworthiness for Big Data
    (6) Provenance of content for Big Data
    (7) Trustworthiness of ratings/recommender systems for Big Data
    Security
    (1) Security model and architecture for Big Data
    (2) Data mining security for Big Data
    (3) Software and system security for Big Data
    (4) Intrusion detection for Gigabit Networks
    (5) Cryptography and Big Data
    (6) Visualizing large scale security data
    (7) Threat detection using Big Data analytics
    (8) Human computer interaction challenges for Big Data security
    (9) Data protection, integrity standards and policies
    (10) Security and legislative impacts for Big Data
    (11) Managing user access for Big Data
    (12) Secure quantum communications
    Privacy
    (1) Privacy in Big Data applications and services
    (2) Privacy in Big Data end-point input validation and filtering
    (3) Privacy in Big Data integration and transformation
    (4) Privacy in parallel and distributed computation
    (5) Privacy in Big Data storage management
    (6) Privacy in Big Data access control mechanisms
    (7) Privacy in Big Data mining and analytics
    (8) Privacy in Big Data sharing and visualization
    (9) Big Data privacy policies and standards
  10. Data Science and Big Data Analytics

    Due to the rapid development of information technology, including the Internet, Cloud Computing, Mobile Computing, and the Internet of Things, as well as the consequent decrease in the cost of collecting and storing data, big data is now generated by almost every industry, sector, and governmental department. The volume of big data often grows exponentially, or even at rates that outpace the well-known Moore's Law. Meanwhile, big data has been extended from traditional structured data into semi-structured and completely unstructured data of various types, such as text, image, audio, video, click streams, log files, etc. Topics include:
    * Acquisition, representation, indexing, storage, and management of big data
    * Processing, pre-processing, and post-processing of big data
    * Models, algorithms, and methods for big data mining and understanding
    * Knowledge discovery and semantic-based mining from big data
    * Visualizing analytics and organization for big data
    * Context data mining from big Web data
    * Social computing over big Web data
    * Industrial and scientific applications of big data
    * Tools for big data analytics
  11. Integration of P2P Data

    Different architectures and tools for data integration have been investigated by the database research community. One approach that has been successfully used to develop integrated systems is materialization in data warehouses. Another is the mediation-based approach, in which semantic mappings are specified between the sources and a mediated schema. A third approach is to use a peer-to-peer architecture that can facilitate ad hoc, decentralized sharing and administration of data and the definition of semantic relationships. Using this architecture, every user of the system can contribute new data by relating it to existing concepts and schemas, define new schemas that others can use as frames of reference for their queries, or establish new relationships between existing schemas, and query this “Web of Information” in an effective manner.

  12. XML Query Processing in P2P

    The advent and popularity of the World Wide Web (WWW) has enabled access to a variety of semi-structured data and, when a schema is available, this data follows some common XML schema. At the same time, the distribution of content has made centralized solutions inappropriate and ushered in the era of peer-to-peer (P2P) computing, where content is stored in XML databases residing on peers. XML schema caching can be used as a summary indexing technique for searching in P2P networks. In this project, you will study XML query routing in structured and unstructured P2P networks, compare different search strategies, and propose a new strategy.
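
    The schema-caching idea above can be illustrated with a minimal sketch. It assumes that a peer caches, for every peer it knows about, the set of root-to-node label paths of that peer's document, and forwards a path query only to peers whose cached summary contains the path. The peer names, the sample documents, and the functions label_paths, build_schema_cache and route are all hypothetical; a real project would also need cache maintenance and multi-hop routing.

```python
import xml.etree.ElementTree as ET


def label_paths(xml_text):
    """Return the set of root-to-node label paths in an XML document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def build_schema_cache(peer_documents):
    """Map each peer name to the summary (label paths) of its local document."""
    return {peer: label_paths(doc) for peer, doc in peer_documents.items()}


def route(query_path, schema_cache):
    """Return, in sorted order, the peers whose cached summary contains query_path."""
    return sorted(p for p, paths in schema_cache.items() if query_path in paths)


# Hypothetical peers and documents, used only for illustration.
peer_documents = {
    "peer_a": "<library><book><title>XML</title></book></library>",
    "peer_b": "<library><journal><title>VLDB</title></journal></library>",
    "peer_c": "<store><book><price>10</price></book></store>",
}
schema_cache = build_schema_cache(peer_documents)

# Only the returned peers receive the query; flooding would contact all three.
targets = route("/library/book/title", schema_cache)
print(targets)
```

    Counting the peers contacted by this kind of routing against plain flooding is one simple way to compare search strategies.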

  13. View Maintenance for P2P Data

    Data integration is done either as Global-as-view (GAV) or as Local-as-view (LAV). Due to the dynamic nature of data on a peer-to-peer network, there is a need to maintain the mediated schema dynamically. Use caching and indexing techniques to maintain the views.

  14. Trusted Collaboration

    The ongoing, rapid developments in information systems technologies and networking have enabled significant opportunities for streamlining decision making processes and maximizing productivity through distributed collaborations that facilitate unprecedented levels of sharing of information and computational resources. Emerging collaborative environments need to provide efficient support for seamless integration of heterogeneous technologies such as mobile devices and infrastructures, web services, grid computing systems, various operating environments, and diverse COTS products. Such heterogeneity introduces, however, significant security and privacy challenges for distributed collaborative applications. Balancing the competing goals of collaboration and security is difficult because interaction in collaborative systems is targeted towards making people, information, and resources available to all who need it whereas information security seeks to ensure the availability, confidentiality, and integrity of these elements while providing it only to those with proper trustworthiness.

    This project can have many subprojects:

    • Access control models and mechanisms for collaboration environments
    • Privacy control in collaborative environments; security and privacy issues in mobile collaborative applications
    • Trust models, trust negotiation/management for collaborative systems


  15. Data Engineering for Blogs, Social Media and Web 2.0

    Social media systems such as weblogs, photo and link sharing sites, wikis and on-line forums are estimated to produce up to one third of new Web content. Several attributes set these "Web 2.0" sites apart from traditional Web pages and resources: they are often annotated with semantic metadata, they are intertwined with human social networks, and their constituent parts exhibit a rich set of relations and connections through comments, trackbacks, advertisements, tags, and other metadata. Projects involve developing models of these new information sources, understanding how to manage them, and developing techniques to extract useful information from them.

    Projects include: graph analysis, trust and reputation in open media, data mining of social media, crawling and indexing, ranking and influence measurement of blogs/bloggers, cross-media link detection and analysis, opinion and sentiment detection, measuring and predicting social media phenomena such as "buzz spread", metadata detection, and topic detection and tracking.

  16. Web Warehouse Design for P2P Search Engine

    Many peers store data and have their own local warehouse of web data. Other peers can (independently) retrieve information. Typically, peers search for information using a centralized server. However, in a P2P warehouse environment, there is no central repository. Since the web warehouse at each peer stores web documents, we can use this cache to search in a P2P manner. In this project, you have to design the architecture of a P2P web warehouse search system that supports searching and manipulation of web documents.

  17. Web Delta Mining

    This project will investigate the issues in mining web changes for semantic knowledge. This requires detecting and mining the changes that occur on web pages over time.
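
    As a starting point, the change-detection step might look like the following minimal sketch. It assumes two stored snapshots of the same page as plain text and uses Python's standard difflib module to separate inserted from deleted lines; the function name page_delta and the sample snapshots are hypothetical, and mining semantic knowledge from the resulting deltas is left to the project.

```python
import difflib


def page_delta(old_text, new_text):
    """Return (inserted_lines, deleted_lines) between two versions of a page."""
    old_lines, new_lines = old_text.splitlines(), new_text.splitlines()
    inserted, deleted = [], []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted.extend(old_lines[i1:i2])
        if tag in ("replace", "insert"):
            inserted.extend(new_lines[j1:j2])
    return inserted, deleted


# Hypothetical snapshots of one page taken at two different times.
old_page = "Price: 10\nIn stock\nFree shipping"
new_page = "Price: 12\nIn stock\nFree shipping\nNew: gift wrap"

inserted, deleted = page_delta(old_page, new_page)
print("inserted:", inserted)
print("deleted:", deleted)
```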

  18. Storing XML and Detecting Ordered Changes

    In this project, you map XML data to the relational model and detect changes between two versions of the XML data using relational operators. The idea is to detect the changes inside the relational database once the XML data has been mapped to the relational model. You need to know Java/XML very well.
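
    The approach can be sketched as follows. This is an illustration in Python with an in-memory SQLite database, not a prescribed design: it assumes each element is stored as one row keyed by a positional path (the tag plus its position among its siblings), so that reordering shows up as a change, and it uses the relational EXCEPT operator to compare versions. The table layout and the names shred and diff are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, version, xml_text):
    """Store one row per element: (version, positional path, text)."""
    def walk(node, path):
        text = (node.text or "").strip()
        conn.execute("INSERT INTO node VALUES (?, ?, ?)", (version, path, text))
        # The position among all siblings is part of the path, so order matters.
        for position, child in enumerate(node, start=1):
            walk(child, f"{path}/{child.tag}[{position}]")

    root = ET.fromstring(xml_text)
    walk(root, f"/{root.tag}[1]")


def diff(conn, old, new):
    """Rows present in version `old` but not in version `new`, via EXCEPT."""
    query = (
        "SELECT path, text FROM node WHERE version = ? "
        "EXCEPT "
        "SELECT path, text FROM node WHERE version = ?"
    )
    return sorted(conn.execute(query, (old, new)).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (version INTEGER, path TEXT, text TEXT)")

# Hypothetical versions: the two items are swapped and a note is added.
shred(conn, 1, "<order><item>pen</item><item>ink</item></order>")
shred(conn, 2, "<order><item>ink</item><item>pen</item><note>rush</note></order>")

removed = diff(conn, 1, 2)  # rows only in version 1
added = diff(conn, 2, 1)    # rows only in version 2
print(removed)
print(added)
```

    Because the sibling position is part of each path, the swapped items are reported as changes; a comparison on tag and text alone would report only the new note.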

  19. XML Data Security and Access Control

    In this project, you are required to develop a model for controlling access to different parts of XML documents. You need to map the access-controlled parts to relational data separately.
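
    One simple way to think about such a model is path-based filtering, sketched below. It assumes a policy that lists, for each role, the label paths whose subtrees must be hidden, and it rejects unknown roles rather than showing them everything. The roles, paths, sample document, and the function visible_view are hypothetical, and the sketch does not cover the mapping to relational data.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy: for each role, the label paths whose subtrees are hidden.
DENIED = {
    "nurse": {"/patient/billing"},
    "clerk": {"/patient/diagnosis"},
}


def visible_view(xml_text, role):
    """Return the document as seen by `role`, with denied subtrees removed."""
    if role not in DENIED:
        # Default deny: a role without a policy entry sees nothing.
        raise PermissionError(f"no policy defined for role {role!r}")
    denied = DENIED[role]
    root = ET.fromstring(xml_text)

    def prune(node, path):
        for child in list(node):  # copy the child list, since we remove while iterating
            child_path = f"{path}/{child.tag}"
            if child_path in denied:
                node.remove(child)
            else:
                prune(child, child_path)

    prune(root, f"/{root.tag}")
    return ET.tostring(root, encoding="unicode")


document = (
    "<patient><name>Ann</name><diagnosis>flu</diagnosis>"
    "<billing><total>90</total></billing></patient>"
)
nurse_view = visible_view(document, "nurse")
clerk_view = visible_view(document, "clerk")
print(nurse_view)
print(clerk_view)
```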

  20. Searching Web Services

    Web searching is currently limited to keyword-based document searching. Many software modules, however, need to be searched by their input and output parameters.
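
    To make the idea concrete, here is a minimal sketch of signature-based matching. It assumes each service is described only by the names of its input and output parameters, and it returns the services that can be called with the inputs the caller has and that produce every output the caller wants. The Service class, the registry entries, and find_services are hypothetical; real service descriptions would also carry types and semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    inputs: frozenset
    outputs: frozenset


def find_services(registry, available, wanted):
    """Services callable with the `available` inputs that produce all `wanted` outputs."""
    available, wanted = frozenset(available), frozenset(wanted)
    return [
        service.name
        for service in registry
        if service.inputs <= available and wanted <= service.outputs
    ]


# Hypothetical registry of service signatures.
registry = [
    Service("GeoCoder", frozenset({"address"}), frozenset({"latitude", "longitude"})),
    Service("Weather", frozenset({"latitude", "longitude"}), frozenset({"temperature"})),
    Service("CityWeather", frozenset({"city"}), frozenset({"temperature"})),
]

matches = find_services(
    registry, available={"latitude", "longitude"}, wanted={"temperature"}
)
print(matches)
```

    A keyword search for "temperature" would return both weather services; matching on the signature keeps only the one the caller can actually invoke.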
