
Study Design and Analysis

The Study Design and Analysis area contains 6 sections:

Sample Size

Site Caps

Allocations

Analysis

Designs

Analysis Review

In the New Sample Size section enter the value of the sample size to be used in analysis in the Sample Size text field and click on Save in the top right of the section. Repeat this for each sample size required for the study design. Sample Size must be less than or equal to the Max Cohort Size set in the Virtual Population section.

For each sample size requested, samples will be randomly selected from the Virtual Population generated during the previous KerusCloud step.

How many patients does KerusCloud generate?

To ensure there are enough patients for any combination of options a user may request, KerusCloud generates the maximum number of patients that could be required for a given study design, allocation ratio and recruitment parameters.

This ensures the generated data has the requested distribution parameters and correlation structure.

Max Cohort Size × Number of Levels in Group × Number of Sites
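As an illustration (not KerusCloud's internal code), the upper bound on generated patients can be computed directly from the formula above, using example values:

```python
# Illustrative calculation of the maximum number of generated patients.
# The three input values are example assumptions, not platform defaults.
max_cohort_size = 500   # Max Cohort Size from the Virtual Population section
group_levels = 3        # e.g. placebo, low dose, high dose
n_sites = 4             # number of recruitment sites

max_patients = max_cohort_size * group_levels * n_sites
print(max_patients)  # 6000
```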

A Recruitment type variable must exist in the project before you can enter Site Cap details. In the New Site Cap section enter a name in the Site Cap Identifier box and a description in the Description box. Enter a value for Site Caps for each defined Recruitment Site; the sum of all Site Cap values across all sites must be at least 100%.

The allocation section defines the ratio of randomisation between the levels of the Group variable.

In the New Allocation section click on Allocation Label and type in a name for the label. Specify whether the allocation is to be defined by Population or by another categorical subgroup. If Population is specified, the Group variable is used to define the allocation ratios. If other categorical variables are used, the allocation ratio is defined for each level of the category along with each level of the Group variable. This is commonly referred to in clinical study protocols as a stratified randomisation. For example, categories of a previous treatment can be equally distributed across two treatment arms to ensure balance in the patient populations. In the drop-down box under Allocation Inputs enter a numeric value for the ratio of allocation between the groups and click on Save in the top right of the section.

The user will be asked to provide an Analysis Identifier, which will be referred to later in the Decision Criteria section to select the analysis from which a decision metric is calculated. This is a short alphanumeric character string no more than 16 characters in length. Once a name is selected, the user then needs to select the response variable in Select Variable that they would like analysed. The third input is the type of analysis selected in Select Test. Appropriate options will depend on the outcome variable type and the type of comparison required. Many analysis types allow the user to utilise the Defined By option, which specifies whether the analysis is carried out on the full population or individually on each level of a discrete variable. The types of analyses are described below.

The summary analysis provides summary statistics for a quantitative variable.

For Continuous, Multinomial, Poisson and Negative Binomial variables, this includes the Number In Analysis, Minimum, Maximum, Mean, P(Mean), Median, Standard Deviation, Lower Limit of the Confidence Interval for the Mean, and Upper Limit of the Confidence Interval for the Mean. The two-sided confidence interval is calculated using the formula x̄ ± t(1−α/s, n−1) × sd/√n, where x̄ is the full-sample mean or the group mean, sd is the standard deviation, n is the sample size, t(1−α/s, n−1) is the (1 − α/s)th percentile of the t distribution with n − 1 degrees of freedom, (1 − α) is the level of confidence and s is the number of sides.
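The confidence interval formula above can be sketched with SciPy; the data and variable names here are illustrative assumptions, not KerusCloud internals:

```python
# Sketch of the two-sided confidence interval for a mean, following the
# formula x-bar +/- t(1 - alpha/s, n - 1) * sd / sqrt(n).
import numpy as np
from scipy import stats

x = np.array([4.1, 5.3, 4.8, 5.9, 5.1, 4.6, 5.5, 5.0])  # example data
alpha, sides = 0.05, 2          # 95% confidence, two-sided
n = len(x)
xbar = x.mean()
sd = x.std(ddof=1)              # sample standard deviation
t_crit = stats.t.ppf(1 - alpha / sides, df=n - 1)
half_width = t_crit * sd / np.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
```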

For Binomial variables, this includes the Number In Analysis, P(Proportion), Proportion, Lower Confidence Interval of Proportion and Upper Confidence Interval of Proportion. Similarly, a Summary analysis can be performed on data from a specific time point for an Irreversible Event variable by selecting the time of interest from the drop-down menu.

A Summary analysis can be performed on a Recruitment variable, in which case the following metrics are available: Count per site, Proportion per site, Maximum Time per site, Minimum Time per site, Mean Time per site, Median Time per site, Number In Analysis, Number Recruited at Given Study Time, Study Time to Given Number Recruited, Overall Count, Overall Maximum Time, Overall Minimum Time, Overall Mean Time, Overall Median Time.

A Summary analysis can be performed on a Time-to-Event variable, in which case the following metrics are available: Number In Analysis, Study Time to Given Number of Events, Number of Events at Given Study Time, Study Time to Given Number Censored, Number Censored at Given Study Time. For this variable type, the following metrics are available which provide a summary of the associated Kaplan-Meier curve: Event Proportion at given Time, Time to given Event Proportion, P(Event Proportion at given Time), Lower CI of Event Proportion at given Time, Upper CI of Event Proportion at given Time, P(Time to given Event Proportion), Lower CI of Time to given Event Proportion and Upper CI of Time to given Event Proportion.

A one-sample t-test can be used to test whether a population or group mean is different from, greater than, or less than a hypothesized value. The null mean is set by entering a value for Expected Mean under Analysis Details.

A two-sample t-test would be a common method of analysis for the comparison of two independent group means. The default analysis assumes that the variances in the two groups are not equal and performs a Welch’s two-sample t-test, calculating the degrees of freedom using the Welch-Satterthwaite equation. Selecting Equal Variance Assumed under Analysis Details will result in the standard error calculated from the pooled variance and the degrees of freedom equal to n1 + n2 – 2. The default expected difference between the mean of the two groups is equal to zero. This can be changed by entering a different value in the Expected Difference field.
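A minimal sketch of the two variants described above, using SciPy's `ttest_ind` (the data are illustrative; KerusCloud's own implementation is not shown):

```python
# Welch's two-sample t-test (unequal variances, the default described above)
# versus the pooled-variance test (Equal Variance Assumed).
from scipy import stats

group_a = [12.1, 13.4, 11.8, 14.0, 12.7]
group_b = [10.2, 11.1, 10.8, 9.9, 11.5]

welch = stats.ttest_ind(group_a, group_b, equal_var=False)   # Welch's test
pooled = stats.ttest_ind(group_a, group_b, equal_var=True)   # pooled variance
```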

An ANOVA can be used to compare two or more group means when the data are assumed to be normally distributed. This analysis assumes that the group variances are the same. The user can select from Tukey, Sidak or Bonferroni adjustment methods to adjust p-values for multiple subgroup comparisons or can select None if no adjustment should be performed. By default, for an ANOVA Defined By variable with n subgroups the number of comparisons is calculated as n choose two for the Sidak and Bonferroni adjustment methods. This number can be changed by editing the Number of Comparisons box. Under Decision Criteria, when basing the criterion on the p-value, two subgroups need to be specified. The p-value is based on the Tukey adjusted pairwise comparison of these subgroups.
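The Sidak and Bonferroni adjustments with the default n-choose-2 comparison count can be sketched in a few lines (illustrative only; the Tukey adjustment uses the studentized range distribution and is not shown):

```python
# Sidak and Bonferroni p-value adjustments for pairwise subgroup comparisons.
# The default number of comparisons is n choose 2, as described above.
from math import comb

def adjust(p, n_groups, method):
    m = comb(n_groups, 2)               # default Number of Comparisons
    if method == "bonferroni":
        return min(1.0, p * m)          # multiply by the comparison count
    if method == "sidak":
        return 1.0 - (1.0 - p) ** m     # exact under independence
    return p                            # "None": no adjustment

p_adj = adjust(0.01, n_groups=4, method="bonferroni")  # m = 6, so 0.06
```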

The linear regression analysis fits a generalized linear model with Gaussian family and the identity link function. The method of fitting is iteratively reweighted least squares. The user is required to select variables that will be included in the regression model. Continuous variables will be included as covariates in the model, and a coefficient will be estimated for each of these variables. Group, binomial and multinomial variables will be treated as factors in the model. Interactions can be included by selecting two variables simultaneously in the Available Variables list and clicking on the “>” button. Variables can be removed from the Selected Covariates list by highlighting the variable and clicking the “<” button. Under Decision Criteria after a metric is selected, the user will need to select a covariate for which the metric will be calculated. If a Group, binomial or multinomial variable is selected, the user will need to select two subgroups to compare. The comparison provided will be for the difference in means. The metric can either be the p-value, the coefficient, a posterior probability for the coefficient, or the upper or lower confidence limit of the coefficient. The coefficient will give an estimate of the change in the response variable for a one unit change in the continuous covariate, or the difference of means of the subgroups in the case of a group variable.
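The coefficient interpretation in the last sentence can be illustrated with a plain least-squares fit (names and data are illustrative, not the platform's IRLS machinery):

```python
# Minimal illustration of the coefficient interpretation: for a continuous
# covariate, the fitted coefficient estimates the change in the response
# for a one unit change in the covariate.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x                            # y changes by 2 per unit of x
X = np.column_stack([np.ones_like(x), x])    # intercept + covariate
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef[1] is the estimated change in y per one-unit change in x (here 2.0)
```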

The Signed Rank test, or one-sample Wilcoxon Signed Rank test, can be used to test if the population median (i.e. central value of the distribution) is different from, greater than or less than some hypothesized value. The median stated in the null hypothesis should be entered under Expected Difference, as this test is usually applied to paired differences. This test makes no assumption about the distribution of the data and can be applied to any discrete or continuous quantitative variable. Exact p-values will be calculated by default if the total sample is less than 50. The normal approximation will be provided if the sample is greater than or equal to 50. A continuity correction can be selected under Analysis Details to adjust the normal approximation, since a continuous distribution is being used to approximate a discrete distribution. The continuity correction is applied by reducing the difference between the expected and observed rank sum by 0.5 when calculating the z-score. The continuity correction is selected by default.

The Mann-Whitney test can be used to compare two groups. It assumes that the two samples are independent and that the two populations have equal spread. It does not assume that the data are normally distributed, only that the two distributions have similar shapes. Exact p-values will be calculated by default if the total sample is less than 50. The normal approximation will be provided if the sample is greater than or equal to 50. A continuity correction can be selected under Analysis Details to adjust the normal approximation, since a continuous distribution is being used to approximate a discrete distribution. The continuity correction is applied by reducing the difference between the expected and observed rank sum by 0.5 when calculating the z-score. The continuity correction is selected by default. The default expected mean difference between the groups is 0. This can be changed by entering a different value in the Expected Difference field.
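A sketch of the Mann-Whitney comparison with the continuity correction enabled, as in the default described above (SciPy call with illustrative data; the platform's exact-vs-approximate switching at n = 50 is not reproduced here):

```python
# Two-sided Mann-Whitney test with the continuity correction applied.
from scipy import stats

group_a = [3, 5, 7, 9, 11, 13]
group_b = [2, 4, 6, 8, 10, 12]
res = stats.mannwhitneyu(group_a, group_b, use_continuity=True,
                         alternative="two-sided")
```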

The Kruskal-Wallis rank sum test can be used to test for differences in group medians, similar to a one-way ANOVA. No assumption is made regarding the distribution of the data, only that the groups are independent and that the group distributions have similar shapes. The Kruskal-Wallis test is approximated with a Chi-squared test, where the degrees of freedom are equal to the number of groups minus one. This approximation holds when the sample size in each group is 5 or more. Under Decision Criteria, the p-value criterion is based on the single p-value from the Kruskal-Wallis test.

This analysis tests for the association between two categorical variables, each with two levels, which can be described in a 2-by-2 contingency table: the group or binomial variable defines the two categories and the outcome is a binary variable. The odds ratio in the null hypothesis needs to be provided. The value entered here defines the null hypothesis, and the test detects deviation away from this value. The default value is 1, which tests the null hypothesis that the effects of the two groups are equal.
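The odds ratio for a 2-by-2 table can be computed directly; the counts below are illustrative assumptions, not platform output:

```python
# Odds ratio for a 2-by-2 contingency table (rows: group, columns: outcome).
a, b = 20, 10   # group 1: successes, failures
c, d = 12, 18   # group 2: successes, failures

odds_ratio = (a * d) / (b * c)   # (20*18)/(10*12) = 3.0
# A null value of 1 corresponds to equal odds in the two groups.
```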

The type of analysis expected for this Chi-squared test is in the form of a two-dimensional contingency table, defined by a group, binomial, or multinomial variable, and applied to a binomial or multinomial outcome variable. These two factor variables will be converted into a contingency table and the Chi-squared test applied to the counts in the two-dimensional table. By default, a Yates continuity correction is applied to the data. This can be deselected by clicking on the box next to Continuity Correction.

An exact test for the null hypothesis regarding the probability of success in a Bernoulli experiment is performed. This is applied to a binomial outcome variable, where the level associated with zero is taken as “failure” and the level associated with one is taken as “success”. The test needs to be defined by a Group, binomial or multinomial variable. Under Decision Criteria one subgroup from this variable needs to be selected, and the p-value or proportion will be calculated for this subgroup. By default, a Yates continuity correction is applied to the data. This can be deselected by clicking on the box next to Continuity Correction.
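A standard-library sketch of a one-sided exact binomial p-value, P(X ≥ k) under H0: p = p0 (illustrative; the platform's handling of two-sided p-values may differ):

```python
# One-sided exact binomial test: probability of k or more successes in n
# Bernoulli trials under the null success probability p0.
from math import comb

def exact_binom_p_upper(k, n, p0):
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i)
               for i in range(k, n + 1))

p_value = exact_binom_p_upper(k=8, n=10, p0=0.5)  # 56/1024
```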

A normal approximation to the binomial test comparing two proportions is performed. By default, a Yates continuity correction is applied to the data. This can be deselected by clicking on the box next to Continuity Correction. The test needs to be defined by a Group, binomial or multinomial variable. Under Decision Criteria two subgroups need to be specified when a p-value or difference in proportions metric is selected. The default null hypothesis is that the two proportions are equal.
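The normal approximation with a Yates-style continuity correction can be sketched with the standard library alone (function name and data are illustrative):

```python
# Two-sided normal-approximation test comparing two proportions, with an
# optional Yates continuity correction that shrinks the observed difference.
from math import erf, sqrt

def two_prop_z(x1, n1, x2, n2, correction=True):
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                 # pooled proportion
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    diff = abs(p1 - p2)
    if correction:                                 # Yates correction
        diff = max(0.0, diff - 0.5 * (1 / n1 + 1 / n2))
    z = diff / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # two-sided p-value

p = two_prop_z(40, 100, 25, 100)
```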

The logistic regression analysis fits a generalized linear model with binomial family and the logit link function. It requires a binomial response variable. The method of fitting is iteratively reweighted least squares. The user is required to select variables that will be included in the regression model. Continuous variables will be included as covariates in the model, and a coefficient will be estimated for each of these variables. Group, binomial and multinomial variables will be treated as factors in the model. Interactions can be included by selecting two variables simultaneously in the Available Variables list and clicking on the “>” button. Variables can be removed from the Selected Covariates list by highlighting the variable and clicking the “<” button. Under Decision Criteria after a metric is selected, the user will need to select a covariate for which the metric will be calculated. If a Group, binomial or multinomial variable is selected, the user will need to select two subgroups to compare. The comparison provided will be for the difference in means, which is a log odds ratio. The metric can either be the p-value, the coefficient, a posterior probability for the coefficient, or the upper or lower confidence limit of the coefficient. The coefficient will be the log odds for a continuous variable, and the difference of log odds of the two subgroups in the case of a group variable. This coefficient gives the log odds ratio between the two subgroups.

The Poisson regression analysis fits a generalized linear model with Poisson family and the log link function. The method of fitting is iteratively reweighted least squares. The user is required to select variables that will be included in the regression model. Continuous variables will be included as covariates in the model, and a coefficient will be estimated for each of these variables. Group, binomial and multinomial variables will be treated as factors in the model. Interactions can be included by selecting two variables simultaneously in the Available Variables list and clicking on the “>” button. Variables can be removed from the Selected Covariates list by highlighting the variable and clicking the “<” button. Under Decision Criteria after a metric is selected, the user will need to select a covariate for which the metric will be calculated. If a Group, binomial or multinomial variable is selected, the user will need to select two subgroups to compare. The comparison provided will be for the difference in means, which is a difference of two log counts. The metric can either be the p-value, the coefficient, a posterior probability for the coefficient, or the upper or lower confidence limit of the coefficient. The coefficient will represent a change in log counts for a one unit change in the continuous covariate, or the difference in log counts of two subgroups in the case of a group variable. The difference of the log counts can be interpreted as the log of an incidence rate ratio. To allow for overdispersion the user can select a Dispersion Estimate method to estimate a dispersion parameter. Selecting the Deviance method estimates dispersion as the model deviance divided by the degrees of freedom, while selecting Pearson’s Chi-Square estimates dispersion as the Pearson’s Chi-Square statistic divided by the degrees of freedom. The default option of None uses a dispersion value of 1.

A Cochran-Mantel-Haenszel test analysis can be performed on a binomial outcome variable to compare between categories when the data is stratified. The comparison categories are specified by choosing the Defined By variable and the stratification specified by choosing the Stratified By variable. The Preferred Variance Method option can be changed to use either Robins or Koch variance methods. By default, a Yates continuity correction is applied to the data. This can be deselected by clicking on the box next to Continuity Correction.

As described above, the Summary analysis on a Time-to-Event variable provides summary statistics for a Kaplan-Meier curve. This includes the event proportion at a given time (and lower and upper confidence intervals of this proportion) and the time at a given event proportion (and lower and upper confidence intervals of this time).

The log-rank test can be implemented on a Time-to-Event variable, defined by a previously detailed discrete variable. There is also an option to stratify the log-rank test by another previously defined discrete variable. By default, the log-rank test will give equal weighting to each event (Weighting = “unweighted”). However, the following weighting options can also be implemented in the log-rank test:

  1. Peto-Peto: the weight is taken as the estimated survival function across all individuals at time t. This leads to greater weight being attributed to events where the survival function is higher (earlier in the Kaplan-Meier plot) compared to when the survival function is lower (later in the Kaplan-Meier plot).
  2. Gehan-Breslow: Otherwise known as the generalized Wilcoxon, the weight is taken as the number of individuals at risk at each time point, t. This leads to greater weight being attributed to events where a greater number of people are at risk (earlier in the Kaplan-Meier plot) compared to when less people are at risk (later in the Kaplan-Meier plot).
  3. Modified Peto-Peto: the weight is taken in a similar way to the Peto-Peto weight, but multiplied by: nt/(nt+1), where nt is the number of individuals at risk across all groups at time t. This places even more weight on the early vs. late events than the Peto-Peto weights.
  4. Tarone-Ware: the weight is taken as the square root of the number of individuals at risk at each time point. This leads to greater weight being attributed to events where a greater number of people are at risk (earlier in the Kaplan-Meier plot) compared to when less people are at risk (later in the Kaplan-Meier plot). However, as a result of the square root being taken, the effect of the weighting is less than the Gehan-Breslow weighting.
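The four weighting schemes above can be summarised as functions of the number at risk n_t and the pooled survival estimate S_t (a sketch with illustrative names, not KerusCloud's implementation):

```python
# Log-rank weights at an event time t, given n_t subjects at risk and the
# pooled Kaplan-Meier survival estimate s_t across all groups.
from math import sqrt

def logrank_weight(scheme, n_t, s_t):
    if scheme == "unweighted":
        return 1.0                      # equal weight for every event
    if scheme == "gehan-breslow":
        return float(n_t)               # number at risk
    if scheme == "tarone-ware":
        return sqrt(n_t)                # square root of number at risk
    if scheme == "peto-peto":
        return s_t                      # pooled survival estimate
    if scheme == "modified-peto-peto":
        return s_t * n_t / (n_t + 1)    # Peto-Peto times n_t/(n_t + 1)
    raise ValueError(scheme)
```

Early events (large n_t, high S_t) receive more weight than late events under every scheme except the unweighted default, with Tarone-Ware intermediate between unweighted and Gehan-Breslow.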

Time to Event analyses: Cox Proportional Hazards regression


A Cox proportional hazards model is fitted, with a Time-to-Event outcome variable; covariates included in the regression are added one at a time for main effects, and jointly for interaction effects. The Efron approximation is used to handle tied event times.

Mixed Models for Repeated Measures can be implemented on Repeated Measures (RM) variables. These models are available when a continuous RM variable is selected under Select Variable. There is one configuration of a mixed model for continuous repeated measures available:

  1. Mixed Models with Time as a Factor Variable (equivalent to a Mixed Model for Repeated Measures (MMRM) model): In this model configuration the user is requested to provide all the terms that need to be included as fixed effects in the mixed model under Analysis Parameters. Time will be treated as a factor (or categorical) variable. The user should include all the main effects as well as the interactions between variables. Typically, this would include at least a main effect for the treatment and time variable, and an interaction term between treatment and time. Terms are added by selecting the term in the left list and clicking on the arrow button to add it to the model parameter list on the right. The user will also be requested to provide a covariance matrix structure for repeated measures from the same subject/experimental unit. The options available are:

    a. Compound Symmetric: All covariance terms between repeated measures are equal (equivalent to a linear mixed effects model with a random intercept). The model will estimate two variance parameters.

    b. Autoregressive (with lag 1): The correlation between repeated measures is assumed to depend on the amount of time between measures, with measures further apart in time having weaker correlation. The model will estimate two variance parameters.

    c. Unstructured: A separate covariance term is estimated between each pair of repeated measures and for each of the variance terms along the diagonal of the within-subject covariance matrix. The number of variance parameters estimated will depend on the number of repeated measures per subject and can be calculated with the following formula: k = m(m+1)/2, where k is the number of variance parameters and m is the number of repeated measures per subject.
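The unstructured-covariance parameter count k = m(m+1)/2 is a one-liner:

```python
# Number of variance/covariance parameters for an unstructured covariance
# matrix with m repeated measures per subject.
def unstructured_params(m):
    return m * (m + 1) // 2

# e.g. 4 repeated measures per subject require 10 parameters
k = unstructured_params(4)
```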

At least one Design scenario is required per KerusCloud project. There are four Design Types:

Fixed

Futility

Group Sequential

Sample Size Reestimation

Where a study does not require an adaptive design, a Fixed design must be specified.

On selecting a Fixed design, no further configuration is required. A maximum of one design of this type can be specified.

Futility, Group Sequential, and Sample Size Reestimation Design Types


After selecting Futility, Group Sequential, or Sample Size Reestimation as the design type, the user selects an Interim Time Type from the following options: Percent Recruited, Study Time, Number Recruited and Number of Events. Users can then enter between one and ten Interim Timings.

Depending on the design type, one or more adaptive design rules will be available. Selecting these enables the user to input the design details.

A Futility adaptive design has one mandatory Futility Rule. As part of the Futility Rule, the user selects an analysis that has been defined in the previous Analysis section. After selecting the desired metric and providing the required associated information, the result will be compared against the user-entered Interim Time Cutoff Values in order to determine futility, with a default assumption that the study is not futile.

When a project contains a Recruitment variable, Study Time is available as an Interim Time Type. In addition, a listing of quantitative metrics per recruitment site and overall (combined recruitment sites) is available when defining a Futility Rule based on a Recruitment-based analysis. The same list of quantitative metrics will also be available when defining Decision Criteria based on a Recruitment-based analysis.

A Group Sequential adaptive design has a mandatory Efficacy Rule and an optional Futility Rule. For the Efficacy Rule, the user selects a previously defined analysis, a suitable metric and specifies any required options for the analysis. The user can then choose from four algorithms for calculating the alpha spending function (Pocock, O’Brien-Fleming, Kim-DeMets and Hwang-Shih-DeCani) and needs to input an alpha parameter or alpha and gamma parameters, depending on the selected algorithm. The chosen analysis will be evaluated at all Interim Timings for the Group Sequential design.
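The four named algorithms correspond to standard Lan-DeMets alpha-spending forms; the sketch below shows these textbook forms at information fraction t and may differ in detail from KerusCloud's implementation:

```python
# Standard alpha-spending functions: each spends 0 at t = 0 and the full
# alpha at t = 1. norm_ppf is a simple bisection inverse of the normal CDF
# (illustrative only).
from math import log, exp, sqrt, erf, e

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def norm_ppf(q, lo=-10.0, hi=10.0):
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if norm_cdf(mid) < q else (lo, mid)
    return (lo + hi) / 2

def alpha_spend(t, alpha=0.025, gamma=1.0, method="pocock"):
    if method == "pocock":
        return alpha * log(1 + (e - 1) * t)
    if method == "obrien-fleming":
        return 2 * (1 - norm_cdf(norm_ppf(1 - alpha / 2) / sqrt(t)))
    if method == "kim-demets":
        return alpha * t**gamma                    # power family, gamma > 0
    if method == "hwang-shih-decani":
        return alpha * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
    raise ValueError(method)
```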

By default, the optional Futility Rule is active for Group Sequential designs. This is defined in a similar manner to defining a Futility design. When this rule is active, the result at each Interim Timing will be evaluated for futility. The Active switch within the Futility Rule Definition section can be toggled to inactivate this rule.

When a Sample Size Reestimation adaptive design is selected the user must choose a previously defined analysis for the mandatory Sample Size Reestimation Rule. This uses the “Promising Zone” approach described by Mehta and Pocock to evaluate the conditional power at the interim and, if the conditional power falls in the “promising” zone, the sample size will be increased. The choice of analyses is limited to those where a Wald Z-score statistic can be calculated for the metric. After inputting any analysis options and selecting the Mehta and Pocock algorithm the user must input the algorithm parameters. The alpha and beta parameters specify the hypothesis test significance level and 1 – target power, respectively. The Maximum Limit is used to control the upper limit to which the sample size can be increased. There are two options for the Maximum Limit Type: Multiplier and Absolute Value.
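The quantity the promising-zone approach evaluates at the interim is the conditional power under the current trend. A common B-value formulation is sketched below; this is a textbook form offered as an illustration, not the platform's exact calculation:

```python
# Conditional power under the current trend: given the interim Wald Z at
# information fraction t, estimate the probability of crossing the final
# critical value z_crit if the observed effect continues.
from math import sqrt, erf

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def conditional_power(z_interim, t, z_crit):
    b_t = z_interim * sqrt(t)     # B-value at the interim
    drift = b_t / t               # estimated drift under the current trend
    return 1 - norm_cdf((z_crit - b_t - drift * (1 - t)) / sqrt(1 - t))

# A strong interim result gives high conditional power; a null result, low.
cp_strong = conditional_power(3.0, 0.5, 1.96)
cp_null = conditional_power(0.0, 0.5, 1.96)
```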

Please note that the Sample Size Reestimation Rule is only evaluated at the final interim if multiple Interim Timings are defined. If the optional Futility Rule is active, it will be evaluated at all defined timings and if the study is futile at any interim for a simulation, the sample size will not be re-estimated.

Study results from each of the specified designs can be investigated in the Decision Criteria Visualisations page.

The Analysis Review section provides an overview of all sample sizes, allocation ratios, analyses and designs specified, and displays the number of Kerus Credits that will be used. An analysis will be performed for each sample size, allocation, scenario (although scenarios are not shown in the Review page) and design option requested. In addition, for a Futility design type the analysis specified in the Designs tab under Futility Rule Definition will be carried out at each of the specified interim timings.

You can choose from three different speed options to run a task. After selecting a speed, the credit usage for the task will be updated, allowing you to select the option which best balances speed and cost for your needs.

Once happy with the set-up, check the number of Kerus Credits that will be used and click the Go button to carry out the Study Design and Analysis stage, on which the Decision Criteria will then be based.