<html>#2844: HelloWorldCuda
<table style='border-spacing: 1ex 0pt; '>
<tr><td style='text-align:right'> Reporter:</td><td>Anthony Shoup</td></tr>
<tr><td style='text-align:right'>   Status:</td><td>submitted</td></tr>
<tr><td style='text-align:right'>Milestone:</td><td>ET_2024_11</td></tr>
<tr><td style='text-align:right'>  Version:</td><td></td></tr>
<tr><td style='text-align:right'>     Type:</td><td>bug</td></tr>
<tr><td style='text-align:right'> Priority:</td><td>minor</td></tr>
<tr><td style='text-align:right'>Component:</td><td>EinsteinToolkit thorn</td></tr>
</table>

<p>The HelloWorldCuda thorn does not seem to work in my build of the ET_2024_11 release of the Einstein Toolkit. The routine that launches the CUDA kernel, HelloWorldCUDA_evol, computes the launch grid from the cctk_lsh grid sizes, and those values are very large (x-direction &gt; 1x10^9), so the launch requests far too many thread blocks. Below is the code:</p>
<div class="codehilite"><pre><span></span><code>extern "C"
void HelloWorldCUDA_evol(CCTK_ARGUMENTS)
{
  DECLARE_CCTK_ARGUMENTS;

  const int val1 = cctk_iteration;
  const int val2 = 3;
  int res = 42;                 // poison

  const dim3 blockDim(4, 4, 4);

  // const dim3 gridDim((cctk_lsh[0] + blockDim.x - 1) / blockDim.x,  // Commented out by ALS
  //                    (cctk_lsh[1] + blockDim.y - 1) / blockDim.y,
  //                    (cctk_lsh[2] + blockDim.z - 1) / blockDim.z);

  // This code does work - ALS
  const dim3 gridDim((8 + blockDim.x -
</code></pre></div>
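<p>As a possible diagnostic, the grid computation above could be checked on the host before the launch. The following is only a sketch, not part of the thorn: the helper names ceil_div and grid_fits are hypothetical, and the limits in kMaxGrid are the values documented for devices of compute capability 3.0 and newer (real code would read them from cudaDeviceProp::maxGridSize). It uses the same ceiling division as the commented-out gridDim formula and rejects sizes that are non-positive or exceed the per-direction limit.</p>
<div class="codehilite"><pre><span></span><code>#include &lt;array&gt;
#include &lt;cstdint&gt;

// Hypothetical helper: number of blocks of size b needed to cover n points
// (ceiling division, as in the original gridDim formula).
constexpr std::int64_t ceil_div(std::int64_t n, std::int64_t b) {
  return (n + b - 1) / b;
}

// Assumed grid limits (x, y, z) for compute capability 3.0 or newer.
// Real code should query cudaDeviceProp::maxGridSize instead.
constexpr std::array&lt;std::int64_t, 3&gt; kMaxGrid = {2147483647, 65535, 65535};

// Hypothetical check: true if a grid covering lsh[] points with block[]
// threads per block is non-empty and within kMaxGrid in every direction.
bool grid_fits(const std::array&lt;std::int64_t, 3&gt;&amp; lsh,
               const std::array&lt;std::int64_t, 3&gt;&amp; block) {
  for (int d = 0; d &lt; 3; ++d) {
    if (lsh[d] &lt;= 0 || block[d] &lt;= 0) return false;
    if (ceil_div(lsh[d], block[d]) &gt; kMaxGrid[d]) return false;
  }
  return true;
}
</code></pre></div>

<p>With 4x4x4 blocks, grid_fits accepts a local size of 8 in each direction (the hard-coded value that works) and rejects a local size of 10^9 in each direction, because the y and z directions would need more than 65535 blocks. The ticket only states the x-direction value; the y and z values used here are an assumption for illustration.</p>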
<p>--<br/>
Ticket URL: <a href='https://bitbucket.org/einsteintoolkit/tickets/issues/2844/helloworldcuda'>https://bitbucket.org/einsteintoolkit/tickets/issues/2844/helloworldcuda</a></p>
</html>