nsb-toolbox

Utilities for editing Science Bowl questions


NSB Toolbox

A command-line utility for formatting Science Bowl questions

Version 0.5.2 - Updated for 2024 NSB

The NSB Toolbox contains a set of tools to make it easier to write and edit Science Bowl questions. It ensures that questions are compliant with the official Science Bowl format, allowing writers to focus on just writing the questions. It also highlights common formatting errors for editors, allowing them to focus on checking content without worrying that they're missing formatting issues here and there.

Table of Contents

  1. Installation
  2. Documentation
    1. nsb format
      1. Auto-Formatting
      2. Linting
    2. nsb assign
      1. Sample Configuration
    3. nsb make
  3. Known Issues

Installation

Currently, the NSB Toolbox can be installed via pip. To do so:

  • Make sure Python 3.8 or greater is installed on your computer.
  • Run pip install nsb-toolbox in your command line.
  • Verify the installation by running nsb -h in your command line. If the help information for the toolbox appears, the installation was successful.
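
The installation checks can also be scripted. The following is a minimal sketch and not part of the toolbox: the helper names python_is_supported and cli_is_installed are hypothetical, and the only assumption is that installing the package places an nsb executable on your PATH.

```python
import shutil
import sys


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def cli_is_installed(command="nsb"):
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    print("Python 3.8+:", python_is_supported())
    print("nsb on PATH:", cli_is_installed())
```

If the second check prints False after a successful pip install, the directory pip installs scripts into is probably missing from PATH.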

Documentation

You can access the NSB Toolbox via the nsb command. Running nsb -h displays the following help menu.

(base) PS C:\Users\rishik> nsb -h
usage: nsb [-h] {format,make} ...

Utilities for managing Science Bowl .docx files.

optional arguments:
  -h, --help     show this help message and exit

subcommands:
  {format,make}
    format       format a Science Bowl file
    make         make a Science Bowl table

nsb format

nsb format provides two functions in one. First, it is a formatter that ensures Science Bowl questions are properly spaced (four spaces between question type and start of stem, blank line between stem and answer, etc.). Second, it is a linter that highlights questions that it cannot fix. It is important to note that nsb format cannot catch every problem with a question! For example, nsb format will never be able to check question content for correctness. All nsb format can do is eliminate or highlight typical formatting errors.

Usage

nsb format takes a single mandatory argument, the path to the target .docx file. For example:

nsb format path/to/nsb/questions.docx

It also takes an optional argument, --capitalize. If given, nsb format will force every answer line to be capitalized. For example:

nsb format path/to/nsb/questions.docx --capitalize

It also provides a --line-after-stem optional argument to add a blank line after the stem in Multiple Choice questions (the official style prior to 2023). For example:

nsb format path/to/nsb/questions.docx --line-after-stem
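
When many files need formatting, the calls above can be driven from Python. This is a sketch under stated assumptions: build_format_command and format_file are hypothetical helper names, the only flags used are the two documented above, and whether the two flags may be combined in a single call is not confirmed by this document.

```python
import subprocess


def build_format_command(path, capitalize=False, line_after_stem=False):
    """Build the argument list for one `nsb format` call."""
    command = ["nsb", "format", str(path)]
    if capitalize:
        command.append("--capitalize")
    if line_after_stem:
        command.append("--line-after-stem")
    return command


def format_file(path, **options):
    """Run `nsb format` on one file.

    check=True makes subprocess raise CalledProcessError if the
    command exits with a non-zero status.
    """
    subprocess.run(build_format_command(path, **options), check=True)
```

For example, format_file("path/to/nsb/questions.docx", capitalize=True) runs the same command as the --capitalize example above.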

Auto-Formatting

nsb format outright fixes a number of formatting errors. It strives to produce questions that have the following characteristics:

  • The question class (TOSS-UP, BONUS) is uppercase.
  • The subject (Biology, Chemistry, etc.) is title case.
  • The question type (Multiple Choice, Short Answer) is italicized and title case.
  • There are four spaces between the question type and the start of the stem.
  • For multiple choice questions, the stem and choices are each separated by a single paragraph break.
  • There are two paragraph breaks before the answer line.
  • The answer line is uppercase, assuming the --capitalize argument is used.

Notably, nsb format operates by moving, copying, and inserting XML elements. This ensures that user-provided formatting won't be overwritten (for example, superscripts and subscripts for mathematical formulae).
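
To illustrate why moving XML elements preserves formatting, here is a simplified sketch using Python's standard xml.etree module. It is not the toolbox's code: the element names imitate Word's markup, but the real .docx namespaces are omitted and the tiny document is made up for the example.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a Word document body with two paragraphs.
# The middle run of the first paragraph carries subscript formatting.
doc = ET.fromstring(
    "<body>"
    "<p><r><t>H</t></r>"
    "<r><rPr><vertAlign val='subscript'/></rPr><t>2</t></r>"
    "<r><t>O</t></r></p>"
    "<p/>"
    "</body>"
)
source, target = doc.findall("p")

# Move every run from the first paragraph to the second. Each run is
# relocated as a whole element, so its <rPr> formatting child travels
# with it instead of being rebuilt from plain text.
for run in list(source):
    source.remove(run)
    target.append(run)

moved_text = "".join(target.itertext())
subscript = target[1].find("rPr/vertAlign").get("val")
```

After the move, the text reads the same and the subscript marker is still attached to the middle run. Rewriting the paragraph from its plain text would have lost it.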

For example, consider the following improperly formatted questions:

Before Formatting

nsb format will automatically convert these questions to be compliant with the Science Bowl format:

After Formatting

Shorthand notation can also be used to reduce the amount of time writers spend writing boilerplate.

Before Shorthand

TU and B will be converted to TOSS-UP and BONUS, respectively. The shorthand for each subject category is the first letter of the subject, aside from Earth and Space (ES) and Energy (EN). MC and SA will be converted to Multiple Choice and Short Answer, as well.

After Shorthand
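
The shorthand rules can be summarized as three lookup tables. This is an illustrative sketch, not the toolbox's implementation. The function name expand_header is hypothetical, and the full subject names for P and M ("Physics" and "Math") are assumptions inferred from the first-letter rule.

```python
CLASS_SHORTHAND = {"TU": "TOSS-UP", "B": "BONUS"}
SUBJECT_SHORTHAND = {
    "B": "Biology",
    "C": "Chemistry",
    "P": "Physics",       # assumed full name
    "M": "Math",          # assumed full name
    "ES": "Earth and Space",
    "EN": "Energy",
}
TYPE_SHORTHAND = {"MC": "Multiple Choice", "SA": "Short Answer"}


def expand_header(question_class, subject, question_type):
    """Expand shorthand in a question header; other text is left unchanged."""
    return (
        CLASS_SHORTHAND.get(question_class.upper(), question_class),
        SUBJECT_SHORTHAND.get(subject.upper(), subject),
        TYPE_SHORTHAND.get(question_type.upper(), question_type),
    )
```

Note that B is ambiguous on its own (BONUS or Biology). Keeping one table per header position resolves it: expand_header("B", "B", "SA") yields a BONUS Biology Short Answer header.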

Finally, nsb format will automatically correct minor errors in question structure. For example, the following question has multiple X) choices:

Before Multiple Choice Correction

The mislabeled choices will be automatically corrected. Note that an answer line that has been explicitly given will not be auto-capitalized:

After Multiple Choice Correction
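
A rough sketch of this kind of relabeling is shown below. It assumes the standard Science Bowl choice labels W, X, Y, and Z and operates on plain strings, whereas nsb format works on the document's XML. The function name relabel_choices is hypothetical.

```python
import re

CHOICE_LABELS = ["W", "X", "Y", "Z"]
# A choice line is a label letter, a closing parenthesis, then the text.
CHOICE_PATTERN = re.compile(r"^\s*[WXYZ]\)\s*(.*)$", re.IGNORECASE)


def relabel_choices(lines):
    """Relabel four choice lines as W), X), Y), Z) in the order given."""
    if len(lines) != len(CHOICE_LABELS):
        raise ValueError("expected exactly four choices")
    relabeled = []
    for label, line in zip(CHOICE_LABELS, lines):
        match = CHOICE_PATTERN.match(line)
        if match is None:
            raise ValueError(f"not a choice line: {line!r}")
        relabeled.append(f"{label}) {match.group(1)}")
    return relabeled
```

The sketch trusts the order of the choices and only rewrites the labels, so a question with two X) choices keeps its choices in place and gains a Y) label.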

Linting

If nsb format fails to parse a cell, it will raise linting errors by highlighting the question and printing the error in the command line. There are two levels of errors: parsing errors, which will highlight a cell red, and question structure errors, which will highlight the problematic structure yellow. nsb format checks that the following conditions hold and reports an error when one does not:

  • The question has a class, subject, type, stem, and answer. Multiple Choice questions should also have four choices.
  • Question type is correctly labeled - Multiple Choice questions should have choices, Short Answer questions should not.
  • For multiple choice questions, the wording of the answer line should match the wording of the choice.

For example:

Linter Errors

The first question is missing two choices, so it can't be fully parsed, raising a red error. The second question is merely mislabeled - it says it's a Multiple Choice question, but is recognized as a Short Answer question. This raises a yellow error, highlighting the question type. Messages corresponding to these errors are printed in the terminal, as well:

(base) rishi@RISHI-DESKTOP:~$ nsb format after_format.docx
Question 6: Couldn't parse question, was looking for QuestionFormatterState.CHOICES
Question 7: Question type is MC, but has no choices.
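
The structure checks can be sketched as a function over an already-parsed question. This is an illustration only: lint_question and its arguments are hypothetical, and apart from the "no choices" message quoted from the output above, the message wording is invented for the example. The real tool also distinguishes red parsing errors from yellow structure errors, which this sketch does not.

```python
def lint_question(question_type, choices, answer):
    """Return a list of problems for one parsed question.

    question_type is "MC" or "SA"; choices maps a label such as "W"
    to the choice text; answer is the text of the answer line.
    """
    problems = []
    if question_type == "MC":
        if not choices:
            problems.append("Question type is MC, but has no choices.")
        elif len(choices) != 4:
            problems.append(f"Expected 4 choices, found {len(choices)}.")
        else:
            # An MC answer line looks like "Y) ELECTRON".
            label, _, wording = answer.partition(") ")
            expected = choices.get(label.strip().upper())
            if expected is None or expected.strip().lower() != wording.strip().lower():
                problems.append("Answer line does not match the wording of its choice.")
    elif question_type == "SA" and choices:
        problems.append("Question type is SA, but has choices.")
    return problems
```

The comparison ignores case so that a capitalized answer line still matches its choice.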

nsb format is not capable of deleting lines that contain text. This is intentional - while there are errors that nsb format highlights that it could probably fix automatically, the maintainer believes it is more prudent to leave whitespace formatting to nsb format and make any other changes by hand.

nsb assign

nsb assign uses a set of configuration options to automatically assign a set of edited questions to rounds.

Usage

nsb assign takes two mandatory arguments, the path to the edited set of Science Bowl questions and the path to the configuration file. For example:

nsb assign path/to/nsb/questions.docx -c path/to/config.yaml

It also takes an optional --dry-run argument, which will report statistics on the assignment procedure, but will not write the successful assignment to disk. This is primarily useful for determining what additional questions of a given difficulty or type are required. For example:

❯ nsb assign Kulkarni_HS_Chemistry_Regionals.docx -c assign.yaml --dry-run
Set      LOD       TUB       Need      Have
HSR       1       BONUS         8        40
HSR       1     TOSS-UP         8        46
HSR       2       BONUS         0        21
HSR       2     TOSS-UP        10        31
HSR       3       BONUS         0        22
HSR       3     TOSS-UP         0        30
HSR-A     1       BONUS        25        10
HSR-A     1     TOSS-UP        25         5
HSR-A     2       BONUS        23         3
HSR-A     2     TOSS-UP        30         5
HSR-A     3       BONUS        13        13
HSR-A     3     TOSS-UP         6         9
HSR-B     1       BONUS        25         7
HSR-B     1     TOSS-UP        25         6
HSR-B     2       BONUS        23         6
HSR-B     2     TOSS-UP        30         4
HSR-B     3       BONUS        13        13
HSR-B     3     TOSS-UP         6         9
Not writing assignments as this is a dry run.
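
A report like the one above can be scanned for shortfalls with a few lines of Python. This sketch assumes that Need is the number of questions the configuration requires and Have is the number available, which is inferred from the column names. The function name find_shortfalls is hypothetical, and the sample rows are copied from the output above.

```python
SAMPLE_REPORT = """\
Set      LOD       TUB       Need      Have
HSR       1       BONUS         8        40
HSR-A     1       BONUS        25        10
HSR-A     3       BONUS        13        13
HSR-B     2     TOSS-UP        30         4
Not writing assignments as this is a dry run.
"""


def find_shortfalls(report):
    """Return (set, LOD, class, missing) for each row where Need exceeds Have."""
    shortfalls = []
    for line in report.splitlines():
        fields = line.split()
        # Data rows have exactly five columns with numeric LOD, Need, Have;
        # this skips the header row and the closing status line.
        if len(fields) != 5:
            continue
        if not all(fields[i].isdigit() for i in (1, 3, 4)):
            continue
        set_name, lod, tub, need, have = fields
        if int(need) > int(have):
            shortfalls.append((set_name, int(lod), tub, int(need) - int(have)))
    return shortfalls
```

On the sample rows this reports that HSR-A is 15 LOD-1 BONUS questions short and HSR-B is 26 LOD-2 TOSS-UP questions short.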

Sample Configuration File

Below is a sample configuration file for a High School Regional set. The sections are explained in more detail further below.

Configuration:
  Shuffle Subcategory: True 
  Shuffle Pairs: False 
  Shuffle LOD: False
  Random Seed: ~
  Subcategory Mismatch Penalty: 1
  Preferred Writers: []

Round Definitions:
  Tiebreakers:
    TU:
      LOD: [2]

  RoundRobin:
    TU:
      LOD: [1, 1, 1, 1]
      Subcategory: [Organic, ~, ~, ~]

    B:
      LOD: [1, 1, 1, 1]
      Subcategory: [Organic, ~, ~, ~]

  DoubleElim1-4:
    TU:
      LOD: [2, 2, 2, 2]

    B:
      LOD: [2, 2, 2, 2]

  DoubleElim5-6:
    TU:
      LOD: [2, 2, 2, 2]

    B:
      LOD: [2, 2, 3, 3]

  DoubleElim7-9:
    TU:
      LOD: [2, 2, 3, 3]
    B:
      LOD: [2, 3, 3, 3]

Sets:
  - Set: [HSR]
    Prefix: RR
    Rounds: [1, 2]
    Template: RoundRobin

  - Set: [HSR]
    Prefix: TB
    Rounds: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    Template: Tiebreakers

  - Set: [HSR-A, HSR-B]
    Prefix: RR
    Rounds: [3, 4, 6, 7, 8]
    Template: RoundRobin

  - Set: [HSR-A, HSR-B]
    Prefix: RR
    Rounds: [5]
    Template:
      from: RoundRobin
      add:
        TU:
          LOD: [1]
        B:
          LOD: [1]

  - Set: [HSR-A, HSR-B]
    Prefix: DE
    Rounds: [1, 2, 3, 4]
    Template: DoubleElim1-4

  - Set: [HSR-A, HSR-B]
    Prefix: DE
    Rounds: [5, 6]
    Template: DoubleElim5-6

  - Set: [HSR-A, HSR-B]
    Prefix: DE
    Rounds: [7, 8, 9]
    Template: DoubleElim7-9

Configuration Options

Shuffle Subcategory: Setting this option to True randomizes the order that any subcategory specification appears in a round. For example, if you have requested ["Organic", ~, ~, ~], setting this option to True makes the "Organic" subcategory uniformly distributed rather than the first question of each round. Note that setting this option to True breaks any matching between TU and B subcategories.

Shuffle Pairs: Setting this option to True adds a randomization step after each pair of questions has been constructed. This is useful when you have intentionally matched TU and B subcategories, for example, but want to randomize the order that the subcategories appear in each round. Note that even when this is enabled, the final pair of each round will be Short Answer questions.

Shuffle LOD: Setting this option to True randomizes the order that question difficulties appear in each round, similar to the above options.

Random Seed: Setting this option to an integer fixes the generated assignment, all else being equal. If left unspecified or set to None (written as ~ in YAML, as in the sample configuration), the assignment will have a slight random element to it.
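
The interaction between the shuffle options and the seed can be illustrated with Python's standard random module. This is a sketch of the general idea, not the toolbox's code. The name shuffle_slots is hypothetical, and how nsb assign actually draws its random numbers is not documented here.

```python
import random


def shuffle_slots(slots, seed=None):
    """Return a shuffled copy of slots; a fixed seed gives a repeatable order."""
    rng = random.Random(seed)  # seed=None seeds from an unpredictable source
    shuffled = list(slots)
    rng.shuffle(shuffled)
    return shuffled


# With Shuffle Subcategory enabled, "Organic" may land in any slot
# rather than always being the first question of the round.
round_one = shuffle_slots(["Organic", None, None, None], seed=2024)
round_one_again = shuffle_slots(["Organic", None, None, None], seed=2024)
```

Both calls return the same order because they share a seed, while omitting the seed may give a different order on each run.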

Subcategory Mismatch Penalty: Setting this option to an integer specifies how much cost is incurred by creating a subcategory mismatch. Common options include:

1, which says that matching a question with the right difficulty but wrong subcategory is equally costly as using a question whose difficulty is off by 1, but the subcategory is correct.

2, which says that matching a question with the right difficulty but wrong subcategory is always less preferable than using a question with the right subcategory but off-by-one difficulty, but also always more preferable than using a question with the right subcategory but off-by-two difficulty.
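
One cost model that reproduces both statements is the squared difficulty difference plus the mismatch penalty. This is an assumption made for illustration; the toolbox's actual cost function is not given in this document. A purely linear difficulty cost would not fit, since a penalty of 2 would then tie with an off-by-two difficulty. The function name and the "Physical" subcategory are hypothetical.

```python
def assignment_cost(slot_lod, slot_subcategory, question_lod,
                    question_subcategory, mismatch_penalty=1):
    """Cost of placing a question in a slot; lower is better.

    ASSUMPTION: difficulty cost is the squared LOD difference.
    """
    lod_cost = (slot_lod - question_lod) ** 2
    # A slot with no subcategory preference (None) never mismatches.
    mismatch = (slot_subcategory is not None
                and slot_subcategory != question_subcategory)
    return lod_cost + (mismatch_penalty if mismatch else 0)


# Slot wants LOD 1, subcategory "Organic".
wrong_subcategory = assignment_cost(1, "Organic", 1, "Physical", mismatch_penalty=2)
off_by_one = assignment_cost(1, "Organic", 2, "Organic", mismatch_penalty=2)
off_by_two = assignment_cost(1, "Organic", 3, "Organic", mismatch_penalty=2)
```

Under this model a penalty of 2 gives costs of 1, 2, and 4 for off-by-one, wrong subcategory, and off-by-two, matching the ordering described above. A penalty of 1 makes the wrong subcategory tie with off-by-one.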

Preferred Writers: If specified, any writers NOT in this list are given a small penalty, encouraging the optimization algorithm to use the preferred writers. This penalty is very small and should never result in a question of the wrong subcategory or difficulty from a preferred writer being used over a question of the right subcategory and difficulty from an unpreferred writer.

Round Definitions

Round Definitions:
  Tiebreakers:
    TU:
      LOD: [2]

  RoundRobin:
    TU:
      LOD: [1, 1, 1, 1]
      Subcategory: [Organic, ~, ~, ~]
    B:
      LOD: [1, 1, 1, 1]
      Subcategory: [Organic, ~, ~, ~]

Round Definitions serve as templates to build round specifications. Each round definition needs to specify the question types it uses (TU and/or B for TOSS-UP and BONUS) and the Level of Difficulty of each question. Optionally, the subcategories can be specified. Entering a ~ indicates that there is no subcategory preference for that slot.

To explain the above specification in plain English, we want all Tiebreaker rounds to consist of a single TOSS-UP question with a difficulty of 2, and all Round Robin rounds to consist of 4 TOSS-UPs and 4 BONUSes that each have a difficulty of 1. Finally, a quarter of TOSS-UPs and BONUSes should use the "Organic" subcategory.
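
To make the structure concrete, the sketch below expands a round definition into a flat list of question slots. It uses plain Python dictionaries with None in place of the YAML ~. The function name expand_round is hypothetical, and the requirement that LOD and Subcategory lists have equal length is an assumption.

```python
def expand_round(definition):
    """Turn a round definition into a list of (class, LOD, subcategory) slots."""
    slots = []
    for question_class in ("TU", "B"):
        spec = definition.get(question_class)
        if spec is None:
            continue  # this round does not use that question class
        lods = spec["LOD"]
        # A missing Subcategory list means no preference for any slot.
        subcategories = spec.get("Subcategory", [None] * len(lods))
        if len(subcategories) != len(lods):
            raise ValueError("LOD and Subcategory must have the same length")
        for lod, subcategory in zip(lods, subcategories):
            slots.append((question_class, lod, subcategory))
    return slots


TIEBREAKERS = {"TU": {"LOD": [2]}}
ROUND_ROBIN = {
    "TU": {"LOD": [1, 1, 1, 1], "Subcategory": ["Organic", None, None, None]},
    "B": {"LOD": [1, 1, 1, 1], "Subcategory": ["Organic", None, None, None]},
}
```

Here expand_round(TIEBREAKERS) gives a single slot, ("TU", 2, None), and expand_round(ROUND_ROBIN) gives eight slots, two of which request "Organic".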

Sets

Sets:
  - Set: [HSR-A, HSR-B]
    Prefix: RR
    Rounds: [1, 2]
    Template: RoundRobin

The Sets section specifies which rounds will actually be built. A set is defined with the Set, Prefix, Rounds, and Template keys. The Set, Prefix, and Rounds keys are round metadata, while the Template key will build the round using the matching entry in the Round Definitions section.

These keys are used combinatorially - the above set will generate 4 rounds: HSR-A RR1, HSR-A RR2, HSR-B RR1, and HSR-B RR2.
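
The combinatorial expansion is a Cartesian product of the Set and Rounds lists. The sketch below reproduces the four round names listed above. The function name round_names is hypothetical, and the name format is taken from those examples.

```python
from itertools import product


def round_names(sets, prefix, rounds):
    """Return one round name per (set, round number) combination."""
    return [f"{set_name} {prefix}{number}"
            for set_name, number in product(sets, rounds)]


names = round_names(["HSR-A", "HSR-B"], "RR", [1, 2])
```

The result is ["HSR-A RR1", "HSR-A RR2", "HSR-B RR1", "HSR-B RR2"].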

Optionally, Sets can use the from:, add: syntax:

  - Set: [HSR-A, HSR-B]
    Prefix: RR
    Rounds: [5]
    Template:
      from: RoundRobin
      add:
        TU:
          LOD: [1]
        B:
          LOD: [1]

This syntax specifies that the RoundRobin template should be used, but there should be an extra TOSS-UP and BONUS that each have a difficulty of 1.
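
One way to picture the from/add resolution is as a copy of the base definition with the added values appended. This is a sketch under assumptions. The function name resolve_template is hypothetical, and both appending the extra questions at the end of the round and padding their subcategory with "no preference" are guesses about behavior that this document does not specify.

```python
import copy


def resolve_template(template, definitions):
    """Resolve a Template value: either a name or a from/add mapping."""
    if isinstance(template, str):
        return copy.deepcopy(definitions[template])
    # Copy the base so the shared definition is never modified.
    resolved = copy.deepcopy(definitions[template["from"]])
    for question_class, extra in template.get("add", {}).items():
        target = resolved.setdefault(question_class, {})
        for key, values in extra.items():
            target.setdefault(key, []).extend(values)
        # ASSUMPTION: added questions without a subcategory get None.
        if "Subcategory" in target:
            missing = len(target.get("LOD", [])) - len(target["Subcategory"])
            target["Subcategory"].extend([None] * missing)
    return resolved


DEFINITIONS = {
    "RoundRobin": {
        "TU": {"LOD": [1, 1, 1, 1], "Subcategory": ["Organic", None, None, None]},
        "B": {"LOD": [1, 1, 1, 1], "Subcategory": ["Organic", None, None, None]},
    }
}
ROUND_FIVE = resolve_template(
    {"from": "RoundRobin", "add": {"TU": {"LOD": [1]}, "B": {"LOD": [1]}}},
    DEFINITIONS,
)
```

ROUND_FIVE then holds five TOSS-UP and five BONUS slots of difficulty 1, while the RoundRobin definition itself still has four of each.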

nsb make

nsb make produces a blank Science Bowl question table with a designated number of lines. This is a convenience function for writers. nsb make -h shows the following help menu:

(base) PS C:\Users\rishik> nsb make -h
usage: nsb make [-h] [-n NAME] [-st {HSR,HSN,MSR,MSN}] [-su {B,C,P,M,ES,EN}] path rows

positional arguments:
  path                  path to the Science Bowl docx file
  rows                  number of rows in output table

options:
  -h, --help            show this help message and exit
  -n NAME, --name NAME  Last, First name of author
  -st {HSR,HSN,MSR,MSN}, --set {HSR,HSN,MSR,MSN}
                        Set
  -su {B,C,P,M,ES,EN}, --subj {B,C,P,M,ES,EN}
                        Subject

For example, the following command creates a table for 120 High School Regional Physics questions by the author "Kulkarni, Rishi":

nsb make -n "Kulkarni, Rishi" -st HSR -su P Kulkarni_HS_Physics_Regionals 120
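
To create tables for several authors or subjects, the command can be assembled in Python and passed to subprocess.run. The helper name build_make_command is hypothetical. The flags and allowed values are taken from the help menu above.

```python
SETS = {"HSR", "HSN", "MSR", "MSN"}
SUBJECTS = {"B", "C", "P", "M", "ES", "EN"}


def build_make_command(path, rows, name=None, question_set=None, subject=None):
    """Build the argument list for one `nsb make` call."""
    if question_set is not None and question_set not in SETS:
        raise ValueError(f"unknown set: {question_set!r}")
    if subject is not None and subject not in SUBJECTS:
        raise ValueError(f"unknown subject: {subject!r}")
    command = ["nsb", "make"]
    if name is not None:
        command += ["-n", name]
    if question_set is not None:
        command += ["-st", question_set]
    if subject is not None:
        command += ["-su", subject]
    # Positional arguments come last: output path, then number of rows.
    command += [str(path), str(rows)]
    return command
```

Passing the list to subprocess.run avoids shell quoting problems with the comma and space in the author name.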

Known Issues

  • If nsb format is used on a document with tracked changes, it will assume the changes were accepted.

Please report any other issues you find on GitHub.