Making social media data insightful without harming privacy

About the project

At the Amsterdam University of Applied Sciences, teachers and students are developing a social media platform for project-based picture sharing. The pictures form a mind map of photos about a given subject. As a follow-up project, our team was asked to make the platform's data insightful and to explore patterns in it without intruding on users' privacy. The platform is called SnappThis and is available on iOS and Android.

Problem statement

Analyzing behavioral and usage patterns on a social media platform is difficult, but it can provide valuable insight into how the platform could be improved. How can we analyze and present user data and explore these patterns without intruding on users' privacy?

Proposed solution

The initially proposed solution differs greatly from the final one. That is typical in software development, but it is also largely a result of our development method. In the initial proposal, the product would convert data exports from the SnappThis application into a database that the product could use more easily. Once the data was imported, a multi-step search, filter and select process would start, in which the user could tune the desired data set. The final data set could then be displayed as graphs or tables; it would be up to the user to choose the representation.

Iterative development with Scrum

The product was developed using Scrum, a common iterative development process. The development cycle was split into 4 sprints, each lasting 3 to 4 weeks. At the end of each sprint the current product was reviewed with the product owner. During this review, new ideas and changes to the product could be discussed.

The product

The final product contains only a 2-step search and select process instead of the more advanced 5-step process. However, this simpler selection process allowed for the development of column-based searching, in which multiple searches can be executed at once and compared side by side in a horizontal view. It is important to note that the data in the screenshots might seem odd. That is because all data has been replaced with fake data to protect the privacy of SnappThis users. The actual dashboard contains real data. The product also includes permissions based on the users in the original database, although passwords are not shared between the applications.

Login screen.
Main search page with menu and a single column.
Data import page, which is only accessible to administrators.

Multi-column search


Here we show a single column on the main search page. In this column we can select multiple groups at the same time. The results are inclusive: if you select multiple groups you get all the snappmaps of those groups, and in turn all the pictures in those snappmaps. You can, however, also specify a selection of snappmaps. The result set is then limited to pictures that are part of the selected snappmaps.

After finalizing the selection in one column, we can create additional columns, or delete them if we end up not using them. The initial column can never be deleted, as it is used to create additional columns and no result set can exist without at least one column. Once the selection of columns, groups and snappmaps is complete, the result set is generated by pressing the filter button.

The result set

The results are grids of pictures from the selected groups and snappmaps, divided into columns. The columns of the result set match the columns on the search page, which enables direct comparison of the pictures. Individual pictures can be selected from each column. Because multiple columns can contain the same picture, a picture that is selected more than once is marked in red. The final selection of images can be turned into a collage and stored. The parameters for this collage, such as the number of pictures per row, are configured by the user.


The short video above shows the fast and fluent interaction with the product while selecting images from several identical result sets, the prevention of duplicate image selections, and turning the selected images into a collage and tuning the collage settings. This concludes the description of our product from a user perspective. If you are interested in how the product works below the surface, I suggest you keep on reading.

Software techniques

In a nutshell: the product uses a large collection of software frameworks and libraries, which have been extended and tuned to meet the product goals. These frameworks include Hibernate ORM, Spring MVC 4 and AdminLTE.

There are more libraries and frameworks, but these are the most important. Spring MVC was a requirement set by the university for the assignment; however, our team was not impressed with the documentation and certain design choices of Spring.

AJAX API

This led to the development of a custom API for communication between the browser and the server through AJAX. The AJAX requests are encoded as JSON. The API allows Hibernate queries to be built from an array of parameters that are selected and defined in JavaScript on the client side. The server validates the request and performs various permission and validity checks, after which it translates the JSON array into a Hibernate query. This communication channel between JavaScript and the web server allows real-time modifications and interactions with the website without reloading the page.
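
The step from validated request parameters to a Hibernate query could look roughly like the sketch below. It is not the project's actual code: the class `FilterQuerySketch`, the attribute names `groupId` and `snappmapId`, and the whitelist are all assumptions made for illustration, and the JSON is assumed to have been parsed into `Filter` objects already.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: turns already-parsed request filters into a parameterized HQL string. */
final class FilterQuerySketch {

    /** One filter from the request, for example field "groupId" with values [1, 2]. */
    static final class Filter {
        final String field;
        final List<?> values;

        Filter(String field, List<?> values) {
            this.field = field;
            this.values = values;
        }
    }

    /** The HQL text plus the named parameters that still have to be bound to the query. */
    static final class BuiltQuery {
        final String hql;
        final Map<String, Object> parameters;

        BuiltQuery(String hql, Map<String, Object> parameters) {
            this.hql = hql;
            this.parameters = parameters;
        }
    }

    // Assumed whitelist: only these entity attributes may be filtered on by the client.
    private static final Set<String> ALLOWED_FIELDS = Set.of("groupId", "snappmapId");

    /** entityName must be a server-side constant; it is never taken from the request. */
    static BuiltQuery build(String entityName, List<Filter> filters) {
        List<String> conditions = new ArrayList<>();
        Map<String, Object> parameters = new LinkedHashMap<>();
        for (Filter filter : filters) {
            // Validity check: reject any field the server does not explicitly allow.
            if (!ALLOWED_FIELDS.contains(filter.field)) {
                throw new IllegalArgumentException("Field not allowed: " + filter.field);
            }
            if (filter.values.isEmpty()) {
                continue; // an empty selection adds no restriction
            }
            // Client values only ever travel as bound parameters, never as query text.
            String parameterName = "p" + parameters.size();
            conditions.add("e." + filter.field + " in (:" + parameterName + ")");
            parameters.put(parameterName, new ArrayList<Object>(filter.values));
        }
        String hql = "from " + entityName + " e";
        if (!conditions.isEmpty()) {
            hql += " where " + String.join(" and ", conditions);
        }
        return new BuiltQuery(hql, parameters);
    }
}
```

With Hibernate, the resulting string would typically be passed to `session.createQuery(...)` and each parameter list bound with `setParameterList`. Keeping client input out of the query text is the main design point of the sketch.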

Diagram of the classes and system components that realize the AJAX API on the server side. This diagram also shows the database and file storage integrations.

Both the JavaScript and the Java side of the AJAX API try to conform to the same model, although each has its own definition. Because of the differences between the programming languages, the two definitions also differ in the way they are written.

Java / JavaScript model comparison

Importing and converting data

The use of Hibernate has greatly simplified the conversion of JSON data into MySQL, because every identifiable entity is modeled as a Hibernate entity class. By creating instances of these classes and filling their attributes from the disassembled JSON, the product can store (consolidate) the entities in many common databases, as long as Hibernate supports that database type.

Before they are written to the database, all entities are collected in a large hash set. This hash set automatically ensures that the data set does not contain duplicates. Once the hash set is complete for every Hibernate entity type, all the data is written to the database.
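
A hash set only removes duplicates when the stored objects define what "equal" means. The sketch below illustrates that idea with a made-up `PictureRecord` class; the real entity classes and their identifying attributes are not shown in this article, so the `sourceId` field is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for an imported entity; not one of the project's real entity classes. */
final class PictureRecord {
    final long sourceId; // identifier from the export, assumed to be unique per picture
    final String url;

    PictureRecord(long sourceId, String url) {
        this.sourceId = sourceId;
        this.url = url;
    }

    // Two records describe the same picture when their source identifiers match.
    @Override
    public boolean equals(Object other) {
        return other instanceof PictureRecord
                && ((PictureRecord) other).sourceId == sourceId;
    }

    // Must agree with equals, otherwise the HashSet cannot find the duplicate.
    @Override
    public int hashCode() {
        return Long.hashCode(sourceId);
    }
}

/** Collects records before they are written to the database. */
final class ImportBufferSketch {
    private final Set<PictureRecord> pictures = new HashSet<>();

    /** Returns true if the record was new, false if an equal record was already buffered. */
    boolean add(PictureRecord record) {
        return pictures.add(record);
    }

    int size() {
        return pictures.size();
    }
}
```

Without the `equals` and `hashCode` overrides, two objects built from the same JSON record would count as different elements and both would end up in the set.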

After the entities have been stored, every image is downloaded locally to dramatically increase the speed of collage generation. The total collection of images is fairly large and has the potential to grow rapidly, as is common on social media platforms. Because of this, the product uses a scaling multi-threaded image downloader. This custom downloader divides the full list of pictures into smaller lists, a process known as "chopping". Each thread is assigned a sublist of pictures to download. A thread supervisor waits until all threads have finished before continuing with the next stage of the import. Threads do not re-download images that were already downloaded during a previous import. This greatly reduces overhead and bandwidth use when data sets are imported incrementally.
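
The chopping and supervising steps can be sketched as follows. This is an illustration under assumptions, not the project's downloader: the `Fetcher` interface, the class name and the rule that the file name is the last path segment of the URL are all invented here, and file names are assumed to be unique.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ChoppedDownloaderSketch {

    /** Hypothetical abstraction over "fetch the bytes behind this URL". */
    interface Fetcher {
        byte[] fetch(String url) throws IOException;
    }

    /** Splits the list into at most `parts` consecutive sublists ("chopping"). */
    static <T> List<List<T>> chop(List<T> items, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        if (items.isEmpty() || parts < 1) {
            return chunks;
        }
        int chunkSize = (items.size() + parts - 1) / parts; // ceiling division
        for (int start = 0; start < items.size(); start += chunkSize) {
            chunks.add(items.subList(start, Math.min(start + chunkSize, items.size())));
        }
        return chunks;
    }

    /** Downloads all URLs into targetDir, one task per chunk; returns the number of new files. */
    static int downloadAll(List<String> urls, Path targetDir, int threads, Fetcher fetcher)
            throws Exception {
        Files.createDirectories(targetDir);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> chunk : chop(urls, threads)) {
                results.add(pool.submit(() -> downloadChunk(chunk, targetDir, fetcher)));
            }
            int written = 0;
            for (Future<Integer> result : results) {
                written += result.get(); // the supervisor blocks here until each worker is done
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }

    private static int downloadChunk(List<String> urls, Path targetDir, Fetcher fetcher)
            throws IOException {
        int written = 0;
        for (String url : urls) {
            Path target = targetDir.resolve(fileNameOf(url));
            if (Files.exists(target)) {
                continue; // already downloaded during a previous import
            }
            Files.write(target, fetcher.fetch(url));
            written++;
        }
        return written;
    }

    /** Assumption: the last path segment of the URL is usable as a local file name. */
    static String fileNameOf(String url) {
        return url.substring(url.lastIndexOf('/') + 1);
    }
}
```

A real `Fetcher` could, for instance, read the bytes from `URI.create(url).toURL().openStream()`. Passing the fetcher in as a parameter keeps the chopping and skipping logic testable without network access.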

The final stage of the import copies all Hibernate picture entities to a special reference table, in which they can be associated with generated collages. This was done to keep the original Hibernate entities based entirely on the provided JSON data.

Generating collages

Collage generation is done with a small set of standard Java classes, such as Graphics, File and BufferedImage. These classes allow us to read the locally stored images and draw them at the desired offsets. This lets us generate collages with relatively simple math, which makes collage generation very efficient. The locally stored images also greatly improve perceived performance, because loading an individual picture takes less time.
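
The "relatively simple math" can be illustrated with a grid layout in which the position of each picture follows from its index. The sketch below is an assumption about how such a layout might work, not the project's generator; in particular, the square tiles and the `compose` signature are invented for the example.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.List;

final class CollageSketch {

    /**
     * Draws the tiles left to right, top to bottom, with `perRow` tiles per row.
     * Every tile is scaled to tileSize x tileSize pixels (a simplifying assumption).
     */
    static BufferedImage compose(List<BufferedImage> tiles, int perRow, int tileSize) {
        if (tiles.isEmpty() || perRow < 1 || tileSize < 1) {
            throw new IllegalArgumentException("need at least one tile and positive sizes");
        }
        int rows = (tiles.size() + perRow - 1) / perRow; // ceiling division
        int columns = Math.min(perRow, tiles.size());
        BufferedImage collage =
                new BufferedImage(columns * tileSize, rows * tileSize, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = collage.createGraphics();
        try {
            for (int index = 0; index < tiles.size(); index++) {
                int x = (index % perRow) * tileSize; // column offset
                int y = (index / perRow) * tileSize; // row offset
                graphics.drawImage(tiles.get(index), x, y, tileSize, tileSize, null);
            }
        } finally {
            graphics.dispose();
        }
        return collage;
    }
}
```

The tiles could be loaded from the local image files with `ImageIO.read(File)` and the result saved with `ImageIO.write`. Cells left over in the last row stay black in this sketch.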

My contribution

During the project I played a big and important role in the software development cycle, leading discussions with the product owner and keeping an overview of the project. I also built important software artifacts: the entire API, both the JavaScript and the Java implementation, including the conversion of JSON into Hibernate queries; and the entire JSON conversion and import process, except for the structure of the Hibernate entities. I also contributed to the collage generator. In addition, I created a guideline for our sprint reviews, which improved the team members' understanding of the project and improved code quality. I am very satisfied with my contributions and the things I have learned during this project. I think our product is capable of providing great insight into the usage of the SnappThis social media platform without users having to be concerned about their privacy.
