Posts

13 Weeks of Data Science for Social Good with the University of Chicago

Data Science for Social Good (DSSG) is a leading fellowship program that helps data science enthusiasts develop interdisciplinary skills aimed at solving social good problems. The DSSG fellowship from the University of Chicago has been bringing together students, mentors and project managers since 2013, with the goal of giving meaning to data generated in NGOs (e.g. Ushahidi) and healthcare organizations (e.g. Jose De Mello Saude, Lisbon, Portugal). As a selected DSSG fellow in 2017, I sought to apply my understanding of machine learning to a healthcare problem in Portugal, working in a hybrid team of statisticians and software engineers. Beyond that, I was able to build corporate data science skills, teamwork, and social skills, which matter even more for the success of a project. In short, DSSG is a 13-week, time-bound workplace built on intensive communication, training, and target-oriented work. Before selection, I envisioned DSSG as a program where students (undergraduate or g...

Report of my trip to ISWC-2017, Vienna

A throwback to the International Semantic Web Conference 2017 (ISWC-2017) in Vienna, Austria. Link to the ISWC-2017 trip report: https://docs.google.com/presentation/d/e/2PACX-1vR_weU1D1zMSgbh_J_VgwCHhm1c6Q7D3cM2-LGW1qGCSRoedaEvnvK3VmLc_Nns3ssg_P_C2JmsayYG/pub?start=true&loop=false&delayms=3000

Data visualization: how important is it to communicate results in an understandable way? Should we be explaining our models to the organizations and people we hand them to? Should there be full transparency regarding the decisions made in the work?

Data visualization plays a key role in how the audience interprets the information being communicated. Information can be communicated in many ways, including textual forms, but these are often overwhelming to interpret because of the complexity and/or amount of data. As a result, the audience can misinterpret the information, which is why it is crucial to present data in a way that is easily understood. I think we should understand the target audience before deciding which models to include and which to leave out. But yes, we should try to help people understand how the model works, using some user stories. When we hand over our models, the recipients can be a team from the business side and IT people. Business people will be interested in the results, and IT people will be interested in "HOW" we achieved those results. Yes, I agree with this point. We should have transparency in the work. We should involve the people from the organiz...

What fields do you think would most benefit from Data Scientist help? Where can we make the most impact? What is there to gain in linking data science to public policy?

In the 21st century, the world is shaped by the concepts of "Data Flood", "Data Lakes" and "Dark Data". Every field, be it IT, psychology, neuroscience, geography, history, economics, etc., generates a large amount of data for research. The CAJAL program by the Champalimaud Foundation is an initiative to strengthen Portugal's health cluster and provide efficient health care services to mankind. Geographical data: remotely sensed images provide spatial, temporal and aerial information about land, helping natural resources data scientists assess the topographical features of the land and shape policy for it. One example is the use of AIS data in various World Economic Forum projects. Neuroscience: brain science, cognitive science and perceptual computing are three intertwined strands promoting artificial intelligence. What if we could use cognitive and perceptual computing together with domain knowledge to understand the functioning of the brain? This helps us in bu...

Data Science in practice (real world) vs. DS online (Kaggle etc.): what does the real data look like?

Data science in the real world differs from data science on Kaggle in several ways. The factors are: 1. Level of complexity of data: The number of tables used for decision-making in an organization is far higher than the number used for machine learning on Kaggle. 2. Lack of a clear objective in the real world: In a Kaggle competition, the programmer knows what to do and how to do it. In the real world, most of the time is spent on data analysis and cleansing, which helps us define a goal that can be achieved in the stipulated time. 3. The sparsity of forums for help: Real-world data problems cannot be solved by simply searching the internet. They require a careful thought process over weeks so that you can develop a plausible strategy. 4. Expectation: Kaggle competitions lay out their expectations clearly, and these are calculated based on the strength of the people participating. In the real world, the expectations of the organization change every day, week or month, and require the data scientist to ad...

PURPOSE FOR JOINING SOFTWARE DEVELOPERS NETWORK AT TOPTAL

"Working smart is designed to take you from dreams distillation to an effective action plan that propels you towards meaningful goals" -- Charlene Fike

The future that I envisage for myself is that of a lifelong scholar and developer, carrying out path-breaking research in the field of Machine Learning and Data Mining and sharing this knowledge with young, curious minds through teaching. To realize this career goal and attain excellence in this field through continuous research and the application of theoretical concepts to real-life situations, I wish to join the community of software developers on Toptal. My primary interest lies in applying data mining, particularly graph mining, to community detection in brain networks. I worked on a project that involves detecting communities over time across normal subjects and Alzheimer's patients. I have a strong interest in data mining, machine learning, computational intelligence and software engineering. I hold a BS degree with first cl...