[VTA] [Hardware] Chisel implementation #3258
Merged
41 commits
868d3db add hardware files (vegaluisjose)
09d1067 add tests (vegaluisjose)
a2f4c6d add rest of files (vegaluisjose)
7bd77e5 add user finish option to tsim app (vegaluisjose)
8b91bec add cmake modifications (vegaluisjose)
c1b2374 make the sim target the default (vegaluisjose)
2f75fd8 bugfix hardcoded parameter in DotProduct (vegaluisjose)
6b1416c add comments to the Core module (vegaluisjose)
9793db1 fix typo (vegaluisjose)
d0cf997 improve wording (vegaluisjose)
7b3f2de generate and map alu opcode automatically (vegaluisjose)
16d77e0 use string interpolation for printing ALU decode bundle (vegaluisjose)
a54ac3f add license (vegaluisjose)
c0646b0 add Fetch doc (vegaluisjose)
cad584c add doc to fetch, core, and decode unit (vegaluisjose)
d8da409 add more docs (vegaluisjose)
8e4c5d7 add comments and over/under-flow check for inflight uops (vegaluisjose)
3098372 add sim back (vegaluisjose)
5b22edb fix typo in debug message (vegaluisjose)
5e67077 add more docs (vegaluisjose)
b3a760f use list instead (vegaluisjose)
0da7273 fix these as well (vegaluisjose)
1284a3b add doc to compute and update core (vegaluisjose)
a8fd84d add doc to TensorAlu (vegaluisjose)
87cce5f add TensorGemm doc (vegaluisjose)
6479114 add more documentation (vegaluisjose)
8d0dff8 add more doc (vegaluisjose)
6f7fd8b add more docs to shell (vegaluisjose)
3d4afc5 add doc for test and vta (vegaluisjose)
0460120 add license (vegaluisjose)
73d52e8 add feedback from linter (vegaluisjose)
8c91ae3 add more feedback from linter (vegaluisjose)
d922234 more lint feedback (vegaluisjose)
0cd7df0 add space (vegaluisjose)
6b3cfa2 fix another one (vegaluisjose)
88d32dd tsim unittest prototype (vegaluisjose)
991cec2 remove tsim unittests, they are now integrated with others (vegaluisjose)
d69feea string strip can be added as an option on execute process (vegaluisjose)
ec6698e remove USE_VTA_TSIM switch, it is not needed anymore (vegaluisjose)
b572100 add doc to tsim init (vegaluisjose)
c7b7a32 check if file extension is provided (vegaluisjose)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,201 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| | ||
| package vta.core | ||
| | ||
| import chisel3._ | ||
| import chisel3.util._ | ||
| import vta.util.config._ | ||
| import vta.shell._ | ||
| | ||
| /** Compute. | ||
| * | ||
| * The compute unit is in charge of the following: | ||
| * - Loading micro-ops from memory (loadUop module) | ||
| * - Loading biases (acc) from memory (tensorAcc module) | ||
| * - Executing ALU instructions (tensorAlu module) | ||
| * - Executing GEMM instructions (tensorGemm module) | ||
| */ | ||
| class Compute(debug: Boolean = false)(implicit p: Parameters) extends Module { | ||
| val mp = p(ShellKey).memParams | ||
| val io = IO(new Bundle { | ||
| val i_post = Vec(2, Input(Bool())) | ||
| val o_post = Vec(2, Output(Bool())) | ||
| val inst = Flipped(Decoupled(UInt(INST_BITS.W))) | ||
| val uop_baddr = Input(UInt(mp.addrBits.W)) | ||
| val acc_baddr = Input(UInt(mp.addrBits.W)) | ||
| val vme_rd = Vec(2, new VMEReadMaster) | ||
| val inp = new TensorMaster(tensorType = "inp") | ||
| val wgt = new TensorMaster(tensorType = "wgt") | ||
| val out = new TensorMaster(tensorType = "out") | ||
| val finish = Output(Bool()) | ||
| }) | ||
| val sIdle :: sSync :: sExe :: Nil = Enum(3) | ||
| val state = RegInit(sIdle) | ||
| | ||
| val s = Seq.tabulate(2)(_ => Module(new Semaphore(counterBits = 8, counterInitValue = 0))) | ||
| | ||
| val loadUop = Module(new LoadUop) | ||
| val tensorAcc = Module(new TensorLoad(tensorType = "acc")) | ||
| val tensorGemm = Module(new TensorGemm) | ||
| val tensorAlu = Module(new TensorAlu) | ||
| | ||
| val inst_q = Module(new Queue(UInt(INST_BITS.W), p(CoreKey).instQueueEntries)) | ||
| | ||
| // decode | ||
| val dec = Module(new ComputeDecode) | ||
| dec.io.inst := inst_q.io.deq.bits | ||
| | ||
| // instruction type bits, LSB first: isLoadUop, isLoadAcc, isGemm, isAlu, isFinish | ||
| val inst_type = Cat(dec.io.isFinish, | ||
| dec.io.isAlu, | ||
| dec.io.isGemm, | ||
| dec.io.isLoadAcc, | ||
| dec.io.isLoadUop).asUInt | ||
| | ||
| // an instruction starts only when every semaphore it pops is ready | ||
| val sprev = inst_q.io.deq.valid & Mux(dec.io.pop_prev, s(0).io.sready, true.B) | ||
| val snext = inst_q.io.deq.valid & Mux(dec.io.pop_next, s(1).io.sready, true.B) | ||
| val start = snext & sprev | ||
| val done = | ||
| MuxLookup(inst_type, | ||
| false.B, // default | ||
| Array( | ||
| "h_01".U -> loadUop.io.done, | ||
| "h_02".U -> tensorAcc.io.done, | ||
| "h_04".U -> tensorGemm.io.done, | ||
| "h_08".U -> tensorAlu.io.done, | ||
| "h_10".U -> true.B // Finish | ||
| ) | ||
| ) | ||
| | ||
| // control | ||
| switch (state) { | ||
| is (sIdle) { | ||
| when (start) { | ||
| when (dec.io.isSync) { | ||
| state := sSync | ||
| } .elsewhen (inst_type.orR) { | ||
| state := sExe | ||
| } | ||
| } | ||
| } | ||
| is (sSync) { | ||
| state := sIdle | ||
| } | ||
| is (sExe) { | ||
| when (done) { | ||
| state := sIdle | ||
| } | ||
| } | ||
| } | ||
| | ||
| // instructions | ||
| inst_q.io.enq <> io.inst | ||
| inst_q.io.deq.ready := (state === sExe & done) | (state === sSync) | ||
| | ||
| // uop | ||
| loadUop.io.start := state === sIdle & start & dec.io.isLoadUop | ||
| loadUop.io.inst := inst_q.io.deq.bits | ||
| loadUop.io.baddr := io.uop_baddr | ||
| io.vme_rd(0) <> loadUop.io.vme_rd | ||
| loadUop.io.uop.idx <> Mux(dec.io.isGemm, tensorGemm.io.uop.idx, tensorAlu.io.uop.idx) | ||
| | ||
| // acc | ||
| tensorAcc.io.start := state === sIdle & start & dec.io.isLoadAcc | ||
| tensorAcc.io.inst := inst_q.io.deq.bits | ||
| tensorAcc.io.baddr := io.acc_baddr | ||
| tensorAcc.io.tensor.rd.idx <> Mux(dec.io.isGemm, tensorGemm.io.acc.rd.idx, tensorAlu.io.acc.rd.idx) | ||
| tensorAcc.io.tensor.wr <> Mux(dec.io.isGemm, tensorGemm.io.acc.wr, tensorAlu.io.acc.wr) | ||
| io.vme_rd(1) <> tensorAcc.io.vme_rd | ||
| | ||
| // gemm | ||
| tensorGemm.io.start := state === sIdle & start & dec.io.isGemm | ||
| tensorGemm.io.inst := inst_q.io.deq.bits | ||
| tensorGemm.io.uop.data.valid := loadUop.io.uop.data.valid & dec.io.isGemm | ||
| tensorGemm.io.uop.data.bits <> loadUop.io.uop.data.bits | ||
| tensorGemm.io.inp <> io.inp | ||
| tensorGemm.io.wgt <> io.wgt | ||
| tensorGemm.io.acc.rd.data.valid := tensorAcc.io.tensor.rd.data.valid & dec.io.isGemm | ||
| tensorGemm.io.acc.rd.data.bits <> tensorAcc.io.tensor.rd.data.bits | ||
| tensorGemm.io.out.rd.data.valid := io.out.rd.data.valid & dec.io.isGemm | ||
| tensorGemm.io.out.rd.data.bits <> io.out.rd.data.bits | ||
| | ||
| // alu | ||
| tensorAlu.io.start := state === sIdle & start & dec.io.isAlu | ||
| tensorAlu.io.inst := inst_q.io.deq.bits | ||
| tensorAlu.io.uop.data.valid := loadUop.io.uop.data.valid & dec.io.isAlu | ||
| tensorAlu.io.uop.data.bits <> loadUop.io.uop.data.bits | ||
| tensorAlu.io.acc.rd.data.valid := tensorAcc.io.tensor.rd.data.valid & dec.io.isAlu | ||
| tensorAlu.io.acc.rd.data.bits <> tensorAcc.io.tensor.rd.data.bits | ||
| tensorAlu.io.out.rd.data.valid := io.out.rd.data.valid & dec.io.isAlu | ||
| tensorAlu.io.out.rd.data.bits <> io.out.rd.data.bits | ||
| | ||
| // out | ||
| io.out.rd.idx <> Mux(dec.io.isGemm, tensorGemm.io.out.rd.idx, tensorAlu.io.out.rd.idx) | ||
| io.out.wr <> Mux(dec.io.isGemm, tensorGemm.io.out.wr, tensorAlu.io.out.wr) | ||
| | ||
| // semaphore | ||
| s(0).io.spost := io.i_post(0) | ||
| s(1).io.spost := io.i_post(1) | ||
| s(0).io.swait := dec.io.pop_prev & (state === sIdle & start) | ||
| s(1).io.swait := dec.io.pop_next & (state === sIdle & start) | ||
| io.o_post(0) := dec.io.push_prev & ((state === sExe & done) | (state === sSync)) | ||
| io.o_post(1) := dec.io.push_next & ((state === sExe & done) | (state === sSync)) | ||
| | ||
| // finish | ||
| io.finish := state === sExe & done & dec.io.isFinish | ||
| | ||
| // debug | ||
| if (debug) { | ||
| // start | ||
| when (state === sIdle && start) { | ||
| when (dec.io.isSync) { | ||
| printf("[Compute] start sync\n") | ||
| } .elsewhen (dec.io.isLoadUop) { | ||
| printf("[Compute] start load uop\n") | ||
| } .elsewhen (dec.io.isLoadAcc) { | ||
| printf("[Compute] start load acc\n") | ||
| } .elsewhen (dec.io.isGemm) { | ||
| printf("[Compute] start gemm\n") | ||
| } .elsewhen (dec.io.isAlu) { | ||
| printf("[Compute] start alu\n") | ||
| } .elsewhen (dec.io.isFinish) { | ||
| printf("[Compute] start finish\n") | ||
| } | ||
| } | ||
| // done | ||
| when (state === sSync) { | ||
| printf("[Compute] done sync\n") | ||
| } | ||
| when (state === sExe) { | ||
| when (done) { | ||
| when (dec.io.isLoadUop) { | ||
| printf("[Compute] done load uop\n") | ||
| } .elsewhen (dec.io.isLoadAcc) { | ||
| printf("[Compute] done load acc\n") | ||
| } .elsewhen (dec.io.isGemm) { | ||
| printf("[Compute] done gemm\n") | ||
| } .elsewhen (dec.io.isAlu) { | ||
| printf("[Compute] done alu\n") | ||
| } .elsewhen (dec.io.isFinish) { | ||
| printf("[Compute] done finish\n") | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
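
For reviewers who want to check the control path without elaborating the hardware, the three-state machine and the `done` lookup in the diff above can be sketched as a plain-Scala behavioral model. This is only a sketch under stated assumptions: `ComputeModel`, `Decoded`, `instType`, `done`, and `step` are hypothetical names that do not exist in the PR, the model needs no Chisel dependency, and it treats `start` and the per-unit done signals as inputs rather than modelling the semaphores or the instruction queue.

```scala
// Plain-Scala behavioral model of the Compute control path (no Chisel required).
// All names here are hypothetical; they only mirror the signals in the diff.
object ComputeModel {
  sealed trait State
  case object Idle extends State // sIdle
  case object Sync extends State // sSync
  case object Exe  extends State // sExe

  // Decoded instruction flags, mirroring the dec.io.* signals.
  final case class Decoded(
    isLoadUop: Boolean = false,
    isLoadAcc: Boolean = false,
    isGemm: Boolean = false,
    isAlu: Boolean = false,
    isFinish: Boolean = false,
    isSync: Boolean = false)

  // Same bit order as Cat(isFinish, isAlu, isGemm, isLoadAcc, isLoadUop):
  // isLoadUop lands in bit 0 and isFinish in bit 4.
  def instType(d: Decoded): Int =
    Seq(d.isLoadUop, d.isLoadAcc, d.isGemm, d.isAlu, d.isFinish)
      .zipWithIndex
      .map { case (flag, bit) => if (flag) 1 << bit else 0 }
      .sum

  // Mirrors the MuxLookup: pick the done signal of the active unit,
  // defaulting to false for any other encoding.
  def done(d: Decoded, uop: Boolean, acc: Boolean, gemm: Boolean, alu: Boolean): Boolean =
    instType(d) match {
      case 0x01 => uop
      case 0x02 => acc
      case 0x04 => gemm
      case 0x08 => alu
      case 0x10 => true // Finish completes immediately
      case _    => false
    }

  // One clock edge of the switch (state) block.
  def step(state: State, d: Decoded, start: Boolean, done: Boolean): State =
    state match {
      case Idle if start && d.isSync         => Sync
      case Idle if start && instType(d) != 0 => Exe
      case Idle                              => Idle
      case Sync                              => Idle
      case Exe                               => if (done) Idle else Exe
    }
}
```

A GEMM instruction, for example, moves `Idle` to `Exe` on `start` and returns to `Idle` only once `done` is true, while a sync instruction spends exactly one cycle in `Sync`.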