2025 Data Science Team Retreat

Brian Lee edited this page Nov 4, 2025 · 4 revisions

Data Science Off-Site Meeting

Wednesday 22 October 2025

Noon – 3:30pm

Agenda

Noon Depart from PSRC office
12:30pm Lunch
1:30pm Ice-breaker
1:45pm Determine items for discussions

  • From creating new tools to maintaining and making use of existing tools
  • Code review: when to do it and how?
  • Commenting within scripts and documentation elsewhere
  • Other items?

2:00pm Self reflections
2:15pm Group discussions
3:15pm Concluding thoughts
3:30pm End of off-site meeting

Presentation slide (this was the only slide used in the retreat)

DISCUSSION THEMES

  • Using/maintaining existing tools
  • (+ documentation & code review)
  • Tool development beyond Data
  • AI applications
  • Other?

Team Brainstorming

Bullets below were sent anonymously via Mentimeter
Double exclamation marks (“!!”) indicate additions from group discussions
Triple exclamation marks (“!!!”) indicate additions from follow-up discussions

Discussion themes: reactions & other suggestions?

  • GitHub can be better organized
  • Projects with unstructured data would be exciting but would also be completely new ground
  • Better/more uses of Data Wiki?
  • Is there a place where all of the tools that have been developed are listed?
  • Very curious about the new agency file management system and how that may impact our work
  • Coding tutorials for data team
  • Guidance for using different spaces: GitHub, Data Wiki, Teams, network
  • Is it necessary to create a template for documentation? Is there a certain level or amount of work/effort that requires documentation vs. exploration/temporary work?
  • How is code review implemented? Create structure to follow?
  • Maybe also the human side of maintenance & documentation: training staff and helping tools get used (easily found, etc.), so they are integrated rather than forgotten or reinvented.
  • +1 to more structure and guidance and expectations around documentation
  • Would all people be expected to conduct code review or would it be specific people? How does that work with other work responsibilities?

The team decided to focus on GitHub, Documentation, and Code Review

Theme 1: GitHub

  • Hard to find work by others
  • Appreciate the ability to collaborate and point people to code in central location
  • Wish we had guidance for naming conventions
  • Can utilize GitHub issues a bit more for tracking project development and collaboration
  • Difficult to sift through all of the code - no formal organizational structure
  • When working on a repo in a group, I would like a clear expectation as to the workflow. How should branches be set up and merged, when should PRs be filed, etc.?
  • When organized, great to have main branch and separate branches for development and experimentation
  • Great to be able to share code with others
  • Great tool for version control. Issue is more with repo organization than with GitHub. We may find some features we're not using to be very helpful.
  • Would like more training beyond basics (I think this would help others in data as well) - always nervous I’m going to break something
  • Pro: a common version control tool. We will always have the history of the project stored in the cloud
  • I forget that there is an optional wiki section to GitHub
  • Some folks (data applications) still work outside it, so some help with staff adoption is needed.
  • travel-studies repo is getting too large. How should we organize data analysis collaboration better?
  • !! Use system of tags to help organize
  • !! Repo owner or point person
  • !! Standards for branching, pushing code to repo
  • !! Outdated code? (travel-studies) archiving?
  • !!! Increase comfort level using GitHub; learn fundamentals/basics

Theme 2: Documentation

  • I know there's documentation on a project, it's just hard to remember where it is
  • It’d be great to have a common system for documentation at least throughout the Data department. Better if it’s throughout the Agency.
  • Have saved documentation in Teams, network, GitHub, etc. and would love to have one best practices spot, although hard depending on project and collaborators
  • It would be nice to have team guidance/agreement about expectations for procedure-level documentation (docstrings). Not so much for others but to hold myself accountable
  • Packages should have vignettes. Code documentation should live in the repo wiki. Project overview, cited resources & connections between repos in a more extensive workflow should happen on the MediaWiki
  • For Urbansim or Elmer related documentation I will always first go to the wiki, but for other projects I'm not certain where it would be.
  • I’m not sure where some documentation belongs: on the Data Wiki, readme file in the repo, repo wiki, etc
  • Project development, work status tracking and work notes/technical documentation can all live in a public GitHub repo; more sensitive parcel database info in the Data Wiki
  • We don't want this to become a chore/burden. Make a practice of explaining choices & capturing knowledge your future self (or others) will appreciate.
  • I appreciate well commented code. Not sure if code should be able to stand on its own or if it should always be accompanied by a readme file or some additional form of explanation
  • Use AI to help?
  • What do we want to achieve with documentation? Translate this into a set of problems we want to solve?
  • Help other staff find and use tools created by others
  • I always start documentation halfway through or at the end because it was supposed to be temporary. Is it worth the time to start some with each project/analysis?
  • AI is quite useful getting up to speed on a codebase you're not familiar with.
  • Documentation is currently piecemeal and scattered across different places/platforms
  • With new file storage will all staff have access to all projects?
  • !! MediaWiki contains stale information, refresh content?
  • !! During the experiment stage, code gets long. How to decide what to keep? How to keep code lean/clean? Use issues to specify stage of code? Keep a separate copy on the J: drive?
  • !!! How to enforce documentation? What level of documentation? What are the expectations? Situation/project dependent?
  • !!! MediaWiki for knitting together projects; GitHub wiki for individual repos?
  • !!! Consider platform options
  • !!! Train chatbot to help users figure out whether something has been done

Theme 3: Code Review

  • What problems would we be solving with code review? 1) Verify concepts are correctly transformed into code. 2) Ensure understanding of what code does. 3) Help develop efficient code. 4) Reviewers learn. 5) Solidify team efforts.
  • Finding a good balance of having code reviewed without burdening people with it
  • Code review seems great in theory but I find it hard to bother teammates to take time to review mine. Partly because it will take their time, partly due to different perspectives
  • During DS meetings, it would be good if people could sometimes share or present the code they are working on to the group and get feedback, or just different sets of eyes for fresh perspectives
  • I prefer doing a targeted code review. If a codebase gets complex and there's no particular issue to address, it can be intimidating to review everything, and hard to know what to accomplish.
  • When I review code of others, I don’t know how extensively I should check, nor what is legitimate commentary vs my own personal style
  • AI can be useful for the mechanics. It can be helpful to have human code review for the underlying logic.
  • More formal process with expectations of request, timing, etc. would be helpful for managing expectations and work load
  • Who does it and for what? Differentiate between “experiment” vs “production” code?
  • I am always requesting Christy’s help and it requires a lot of time to sift through/introduce project/goals when she has other responsibilities
  • Differentiating between a review for confirming correct process vs improving process
  • !! Review requester should specify goal of review
  • !! Determine level of importance?
  • !! Burden on repo lead to merge multiple people’s work?
  • !! Python group code review example useful?
  • !! Multiple people working on same project = build in code review?