2022-05-15

Data Scientists should employ the Form of Journals

Data science is a corporate buzzword for computational statistics, so this post is really about how scientists should operate in commercial settings. If you call yourself a data science team, then you should probably be doing science. Doing science is hard enough, let alone persuading people that you should get paid for it, and this post concerns the latter.

Persuasion is a matter of information transfer, so this post is about information management. To be effective in their organisations, data scientists need to specify the methodology they use for scientific operations. That methodology should be synchronised with the general method practised by the broader scientific community, and so I advise that data scientists use a generic documentation format: the journal paper.

In fact, before doing any real science, a commercial science team is strongly advised to publish a mock journal paper internally, for a number of reasons.

  1. Getting from zero to one on executing a documentation methodology is a lot quicker if you don't wait for the stars to align for commercial science to happen, and instead mock up data and show what sort of conclusions it would imply.
  2. The resulting mock paper serves as a training tool for science team members.
  3. The resulting mock paper serves as an evangelical brochure for non-scientists in the company.

If all documentation of science-ops is in the form of journal papers, that's a pretty good encapsulation. The company develops a library of papers, which serves as the repository of science-ops' docs. Wherever there is good documentation, and it is actually used, teams are able to rest more easily, as they can dump their memories more effectively whenever they clock out for the day. They can also reintegrate their working memories more easily at the start of each day.
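
As an illustration of the mock-paper idea, here is a minimal sketch in Python, using only the standard library. Everything in it is an assumption made for the example: the function names `mock_measurements` and `draft_mock_paper`, the section headings, the pricing-change title, and the numbers themselves, which are simulated and not real data. It only shows the general shape of mocking up data and dropping the implied result into a paper skeleton.

```python
import random
import statistics


def mock_measurements(n, mean, sd, seed):
    """Draw n simulated observations (invented numbers, not real data)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


def draft_mock_paper(title, control, treatment):
    """Fill a plain-text journal-paper skeleton from two mock samples."""
    # Hypothetical analysis: a simple difference in sample means.
    diff = statistics.mean(treatment) - statistics.mean(control)
    sections = [
        ("Title", title + " (MOCK DATA, FOR TRAINING ONLY)"),
        ("Abstract", "We illustrate the team's reporting format on simulated data."),
        ("Methods", f"Two simulated groups of {len(control)} and {len(treatment)} "
                    "observations were compared by difference in means."),
        ("Results", f"Difference in means (treatment minus control): {diff:.2f}."),
        ("Conclusion", "No real-world conclusion follows; the data are simulated."),
    ]
    return "\n\n".join(f"{name}\n{body}" for name, body in sections)


control = mock_measurements(50, mean=100.0, sd=10.0, seed=1)
treatment = mock_measurements(50, mean=105.0, sd=10.0, seed=2)
paper = draft_mock_paper("Effect of a hypothetical pricing change", control, treatment)
print(paper)
```

A real team would swap in its own section list and analysis; the point is only that the skeleton can exist, and be circulated, before any commercial data does.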

IMPORTANT NOTES ON COMMUNICATION
  • journal papers are for scientific audiences
  • non-scientific audiences in commercial settings (that is, all our non-science-oriented colleagues) should receive an additional brief, "a summary of business impact, of this paper's conclusions", which should boil down to 5-25 words of 
    • "we did not learn how to make more or less money from this," or 
    • "we learnt that this makes/costs more/less money"
  • scientists should not be surprised, or discouraged, if non-scientific audiences don't want to understand the methodology, and only want to understand the summary of business impact (prepare yourselves!)
    • have this in mind when presenting to any non-scientific audience, and usually BEGIN WITH THE SUMMARY OF BUSINESS IMPACT
  • because we pick a community-standardised format & methodology, this also means that we make scientific operations easily accessible for audit by qualified members of the scientific community; this is applicable where
    • "internal audit" colleagues wish to check on the quality of scientific operations
    • "quality assurance" colleagues wish to understand scientific operations
    • any employee or member of the company wishes to learn more about science, and scientific operations
    • any external party wishes to assess the quality of scientific operations, for example in the event of due diligence during mergers and acquisitions
