Help Your Company Know What It Knows Part 1 - The Effective Statistician

Ever felt like your organization is drowning in data, but no one really knows where to find what they need?

What if I told you there’s a simple, yet powerful, solution to the chaos of storing and accessing statistical results?

Do you find it challenging to bridge the gap between storing data and making it easily accessible for everyone in your organization?

Ever wondered if there’s a tool that could not only store your statistical results but also make them visually engaging and accessible to various teams?

In our last episode, I delved into a problem that resonates with many organizations – the struggle to truly understand the wealth of information they have at their fingertips. But here’s the thing – knowing you have a problem is just the beginning.

Today, I’m going to take you on a journey, a journey where we unravel the complexities of transforming the way companies handle their statistical results. So, listen in, because we’re about to explore solutions that go beyond the surface-level issues.

I specifically talk about the following points:

Challenges with Traditional Data Storage
Introduction to Dataset Storage
Comprehensive Metadata Inclusion
Anticipation of Future Needs
Data Visualization for Accessibility
Interface Accessibility for Different Teams
Feedback Loop for Continuous Improvement
Benefits of the Process
Challenges in Implementation
Future Exploration in Next Episode

Now that we’ve uncovered the keys to transforming your organization’s approach to statistical results, it’s time to take action. Reflect on the challenges you’ve faced in data storage, accessibility, and collaboration within your organization.

Make sure to share this episode with friends and colleagues who can benefit from it!

Transcript

Help Your Company Know What It Knows Part 1

[00:00:00] Alexander: Welcome to a new episode of The Effective Statistician. In the last episode, we talked about the problem that your company, your organization, likely doesn’t know what it knows, and all the different problems around it. Just to kind of sum it up: the problem is that you store all your different results, the means, the p values, and so on, in a way that makes it really difficult to know whether they exist at all, to locate them, to make sure that you find all of them, that you have access to them, and that you can work with them really, really fast and easily. Now, in this episode, I will talk about how you can change that. [00:01:00] And this is not easy. You know, just to put things into perspective, the process that I will talk about sounds simple, and yes, it actually is simple.

[00:01:17] But to put it into place is anything but simple. So, let’s start with it. First, you need to store your results, in addition to storing your patient level data, in a similar way. You need to store your results in an easy, electronically readable way. So instead of programming your results and then storing them in tables, you first store [00:02:00] them in a data set, in a database, you know.

[00:02:05] This is a very, very simple thing, and most programmers do that anyway. You know, they first create a dataset and then they create the table based on the dataset. And this dataset has all the metadata they need to have, you know, the subgroup, the study, the population, whether this number is a mean or a sample size or the lower limit of a confidence interval. All those different things they have in an R data set or SAS data set, whatsoever. What kind of platform you’re using doesn’t matter here.

[00:02:48] And then it is about storing this data set. And then your tables are just a view on this data set. [00:03:00] You can also store the tables, of course, but you really need to store this data set. This data set can then include all your metadata. Okay, your metadata can include, or must include, for example, the study name, the compound name, the endpoint that you have analyzed here, the reference to where the specifications for that are, and the analysis that you used: is this logistic regression, or is it ANOVA, ANCOVA, whatsoever?

[00:03:42] What is this data? Is it a sample size? Is it an odds ratio? Is it a confidence interval? Is it a p value? From which population is it? From which subgroup is it? At which time point is it? [00:04:00] All these kinds of different things. What imputation method have you used? All these kinds of different things that you need to understand.
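[Editor's illustration] A minimal sketch, in Python, of what such a results data set could look like: one row per stored statistic, carrying the metadata listed above. The `ResultRecord` class, the `table_view` helper, the field names, and all values are hypothetical and made up for illustration; the same structure could live in R, SAS, or a database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultRecord:
    """One stored statistic plus the metadata needed to interpret it."""
    study: str
    compound: str
    endpoint: str
    analysis: str      # e.g. "logistic regression", "ANCOVA"
    population: str
    subgroup: str
    timepoint: str
    imputation: str
    treatment: str
    statistic: str     # what the value is: "n", "mean", "p_value", ...
    value: float       # stored unrounded


def table_view(records, endpoint, statistics, digits=1):
    """Build a treatment-by-statistic table for one endpoint.

    The table is only a view: it is derived from the stored records
    and rounds values for display without changing what is stored.
    """
    table = {}
    for rec in records:
        if rec.endpoint == endpoint and rec.statistic in statistics:
            table.setdefault(rec.treatment, {})[rec.statistic] = round(rec.value, digits)
    return table


# Hypothetical example data (invented numbers, not real study results).
common = dict(
    study="STUDY-001", compound="compound-x", endpoint="pain score",
    analysis="ANCOVA", population="ITT", subgroup="all",
    timepoint="week 12", imputation="none",
)
RECORDS = [
    ResultRecord(**common, treatment="active", statistic="n", value=50.0),
    ResultRecord(**common, treatment="active", statistic="mean", value=3.4567),
    ResultRecord(**common, treatment="placebo", statistic="n", value=48.0),
    ResultRecord(**common, treatment="placebo", statistic="mean", value=4.1234),
]

print(table_view(RECORDS, "pain score", ["n", "mean"]))
```

In this sketch the table is computed on request from the stored rows, so a different layout or rounding is just another call to `table_view` rather than a new program.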

[00:04:08] You very likely have all this metadata in there anyway because you need to put it in your title, in your footnote, in your things that you put with an asterisk, whatsoever. You can store in this metadata as much as you like. Limitation strings. You can even put links there to where the analysis specifications are.

[00:04:40] You can put links there to further references. Whatever. Everything you need to really explain what this data is about: the metadata. And of course, here, in this dataset, you can also store the data, [00:05:00] you know, not rounded, for example. You can also include more things around the data. Let’s say, goodness of fit statistics, confidence intervals, additional p values.

[00:05:16] Additional ways to measure the treatment effects. So often we work with binary data, and then the only thing that we store is the proportions within the different treatment groups and subgroups and things like this, instead of also storing the odds ratios, the risk difference, the relative risk. All these kinds of different things, yeah. Or for continuous data: not just store the means, but also the mean differences, the standardized mean differences, all these different things that you will need sooner or later.
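[Editor's illustration] The additional effect measures mentioned here can be derived at the same time as the proportions and means. The following Python sketch uses the standard textbook definitions; the function names and the example counts are hypothetical, and the standardized mean difference shown is Cohen's d with a pooled standard deviation, which is only one of several common definitions.

```python
import math


def binary_effect_measures(events_trt, n_trt, events_ctl, n_ctl):
    """Effect measures for a binary endpoint from the 2x2 counts.

    Assumes no cell is zero; zero cells would need a continuity
    correction or another method and are not handled in this sketch.
    """
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    return {
        "proportion_treatment": p_trt,
        "proportion_control": p_ctl,
        "risk_difference": p_trt - p_ctl,
        "relative_risk": p_trt / p_ctl,
        # odds ratio = (a * d) / (b * c) for the 2x2 table
        "odds_ratio": (events_trt * (n_ctl - events_ctl))
        / ((n_trt - events_trt) * events_ctl),
    }


def continuous_effect_measures(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl):
    """Mean difference and standardized mean difference (Cohen's d)."""
    diff = mean_trt - mean_ctl
    pooled_var = ((n_trt - 1) * sd_trt**2 + (n_ctl - 1) * sd_ctl**2) / (n_trt + n_ctl - 2)
    return {
        "mean_difference": diff,
        "standardized_mean_difference": diff / math.sqrt(pooled_var),
    }


# Hypothetical counts and summary statistics, for illustration only.
print(binary_effect_measures(30, 100, 20, 100))
print(continuous_effect_measures(10.0, 2.0, 50, 9.0, 2.0, 50))
```

Each returned value would then be stored as its own row in the results data set, next to the proportions or means it was derived from.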

[00:05:56] You can also include there, for [00:06:00] example, which kind of factors did your model include? All these kind of different things you can include in this dataset. And here it’s about being more comprehensive, so that it is really, really easy to use the metadata to understand and find, locate all the different things.

[00:06:30] This is where you can also put things like hashtags or whatsoever, text, into it to more easily understand and find data. Whatever helps you to locate the data. When was it created? Yeah. Things like that. It’s easy. All these different things. So that it is really, really easy to understand what this [00:07:00] number is about. Or, well, it’s not a number: what this kind of text is about, if your data is a text. It is really, really important to have this metadata as comprehensive as possible and also to have the summary statistics as comprehensive as possible. Don’t just think about what you need for your CSR. Think about everything that is needed thereafter.
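[Editor's illustration] Once every stored result carries metadata and tags, locating results becomes a filter over that metadata. A minimal Python sketch, assuming each result is a plain dictionary; the `find_results` helper, the field names, the tags, and the values are all hypothetical.

```python
def find_results(records, tags=(), **criteria):
    """Return records that match every metadata criterion and carry every tag."""
    wanted = set(tags)
    return [
        rec for rec in records
        if all(rec.get(key) == expected for key, expected in criteria.items())
        and wanted <= set(rec.get("tags", ()))
    ]


# Hypothetical stored results (invented values, for illustration only).
RECORDS = [
    {"study": "STUDY-001", "endpoint": "response", "statistic": "odds_ratio",
     "value": 1.71, "created": "2023-05-02", "tags": ["#efficacy", "#primary"]},
    {"study": "STUDY-001", "endpoint": "response", "statistic": "p_value",
     "value": 0.03, "created": "2023-05-02", "tags": ["#efficacy", "#primary"]},
    {"study": "STUDY-002", "endpoint": "nausea", "statistic": "proportion",
     "value": 0.12, "created": "2022-11-15", "tags": ["#safety"]},
]

# Everything tagged as efficacy in one study, oldest first
# (ISO dates sort correctly as plain strings).
hits = find_results(RECORDS, tags=["#efficacy"], study="STUDY-001")
for rec in sorted(hits, key=lambda r: r["created"]):
    print(rec["endpoint"], rec["statistic"], rec["value"])
```

The more comprehensive the metadata, the more questions a simple filter like this can answer without anyone reopening the original tables.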

[00:07:30] What will be needed from a medical affairs perspective? What will be needed from an HTA perspective? All these different things.

[00:07:39] It is about breaking down the silos. And yes, in big organizations we are big about silos. We just focus on the FDA and the EMA. And maybe also on Japan and China, but that’s it. No, it’s more than that. It is also about making [00:08:00] sure that all the treating physicians know about it. It’s about making sure that all the payers have the right evidence.

[00:08:07] And yes, they will ask for different evidence than the FDA. So here, to understand everything you need, you will need to work with a vast array of people, not just your clinical development team.

[00:08:26] Now, you store that in an electronically readable format, whatever format you think is most useful for your organization, but you store it in that format together with all the metadata. Okay. That is step one. Step two is now you need to make it easy to locate all this data. And for that, there is a very, very easy tool.

[00:08:59] [00:09:00] Data visualization. Sounds simple, yet nobody does it, or very, very few organizations do it. Maybe they have some kind of tool here and there, but nobody uses these graphical interfaces to explore all the different data sets they have within their different studies, whatsoever.

[00:09:26] All the other industries, you know, use data visualizations all the time, yet we as statisticians in the pharma industry focus so much just on tables. You need to have a really good graphical interface in place to search for all your metadata and then to be able to drill down into this data, to filter it, to sort it.

[00:09:56] To reduce it so that you can find [00:10:00] all these different things. And this graphical interface needs to be available not just to your statisticians. This needs to be available to many more people because you don’t want, in the end, to have people, you know, going to the statisticians and saying, ah, yeah, by the way, can you help me find this data again?

[00:10:26] And then your statistician again needs to kind of look for the data and is a data provider. And yes, you will need graphical interfaces for different people. Yeah, you will certainly need something where you, of course, manage access, all these kinds of different things. And you will probably provide some kind of tools and templates, yeah, to make sure that people can much more easily [00:11:00] find the typical data they will look for.

[00:11:03] Yeah, let’s say your safety people. Yeah, they will probably use different interfaces than the people that work on PK. Your HTA people will probably have different ways of looking into the data. Your higher management probably wants to have something that is specifically organized for their needs.

[00:11:33] This all can come from the same interface, just with different ways to look into the data. And here it’s about these interfaces having access to the metadata and the patient level data, no, not the patient level data: the summary statistics, the sample sizes, the means, [00:12:00] the confidence intervals, all of that.

[00:12:02] They can display it in an easy and nice way. This is super helpful also for running through your analysis with your study team, yeah? For example, exploring subgroups, all these kind of different areas you can do with this interface.
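[Editor's illustration] The idea of one interface with different ways to look into the same stored results can be sketched as a small view configuration per team. Everything in this Python sketch is hypothetical: the `VIEWS` mapping, the `render_view` helper, the team names, the `domain` field, and the values are assumptions made only to show the pattern.

```python
# One shared store of summary statistics (invented values, illustration only).
STORE = [
    {"study": "STUDY-001", "domain": "efficacy", "endpoint": "response",
     "statistic": "odds_ratio", "value": 1.71},
    {"study": "STUDY-001", "domain": "safety", "endpoint": "nausea",
     "statistic": "proportion", "value": 0.12},
    {"study": "STUDY-001", "domain": "pk", "endpoint": "Cmax",
     "statistic": "geometric_mean", "value": 84.3},
]

# Each team gets its own view definition over the same store:
# which domains it sees and which columns are shown.
VIEWS = {
    "safety": {"domains": {"safety"},
               "columns": ("study", "endpoint", "statistic", "value")},
    "pk": {"domains": {"pk"},
           "columns": ("study", "endpoint", "statistic", "value")},
    "management": {"domains": {"efficacy", "safety"},
                   "columns": ("endpoint", "value")},
}


def render_view(team, store):
    """Return the rows and columns configured for one team's view."""
    view = VIEWS[team]
    return [
        {column: rec[column] for column in view["columns"]}
        for rec in store
        if rec["domain"] in view["domains"]
    ]


for team in VIEWS:
    print(team, render_view(team, STORE))
```

Adding a view for another group would mean adding one entry to the configuration, while the stored results themselves stay untouched.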

[00:12:25] This interface, of course, you know, tables are also just another way to look into this data. You need to have it set up in such a way that you can replicate things really, really fast, and that people can download things from there. So that they can export, let’s say, they’ve done an analysis, or, no, they haven’t done a new analysis, they have created a new view on the existing [00:13:00] analysis, and now they want to download it.

[00:13:03] Because they need to process it further. Then, of course, you need to have some way that you can download things from there.

[00:13:13] Make it easy to request new outputs from there, yeah? So if people, for example, want to say, ah, I need a new analysis here because that is not already here, have some kind of button or whatsoever in there that makes it easy to request a new analysis. This way you will get a much better understanding of everything that is needed across the company.

[00:13:49] Maybe they want something that you have never thought about. Yeah, that way you will learn from all the different areas within [00:14:00] your company what they actually need. And your tool will become easier and easier, more powerful, and so on. You can also create then basically further follow up tools that provide online views to these kind of different things.

[00:14:22] Imagine you have, let’s say, your sales people. Yeah. They will use your data all the time. Yeah. And what’s currently happening is that they use the CSR, a couple of different papers, internal slide sets, whatsoever, and send it to a marketing agency. They put everything into some kind of nice HTML format or whatever you’re using, and then that is used on the iPads or tablets, whatever your company is using, with the sales reps. And then, [00:15:00] whenever there’s an update, they need to go through all these kinds of different things again. Of course, a lot of money is involved and a lot of time resources are spent. Wouldn’t it be much nicer if you could create this more directly, so that whenever there’s an update, let’s say you see that, oh, there’s a new analysis available, it would automatically read into all of that, and you wouldn’t need to spend months and thousands of euros, millions of euros probably, to make sure that all the different results and all the different stuff is updated around the world?

[00:15:50] Yes, in German, in French, in Italian, in Korean, and so on, in every language. It would be so, so nice. And your organization would [00:16:00] have so much more impact. So there needs to be some kind of feedback loop from the user to the stats department to make sure that this interface develops and gets better and better and better and more powerful and more powerful and more powerful.

[00:16:23] This will save you, as an organization, and I’m not talking about the stats organization, I’m talking about the company, so much money. It will help you to have so much more impact from your stats organization. So much more visibility. It will make things so much faster in terms of creating results. In terms of communicating all these things, there’s so many opportunities around it.

[00:16:57] And this process [00:17:00] looks simple, and yes, it is simple, but it’s really, really difficult to implement. Because if you want to implement that, you need to take all the different people that you have with you. You need to take all the different statisticians, the programmers, the medical writers, the people in the real world evidence teams.

[00:17:31] All the different people around the world together to make that change. And that is the hard part. Yes, of course, it’s not super simple to build this whole process and so on. And yes, it also takes some time to build the graphical interfaces and these kinds of things. I absolutely see this. But the biggest obstacle to putting this in place [00:18:00] is creating the change within your organization and having people embrace this change and drive it forward. And in the next episode, I will talk about what you can do to create this change.